Abstract

In Bhatt and Roy's minimal directed spanning tree (MDST) construction for a 
random partially ordered set of points in the unit square, all edges must respect the 
"coordinatewise" partial order and there must be a directed path from each vertex 
to a minimal element. We study the asymptotic behaviour of the total length of 
this graph with power weighted edges. The limiting distribution is given by the sum 
of a normal component away from the boundary and a contribution introduced by 
the boundary effects, which can be characterized by a fixed point equation, and is 
reminiscent of limits arising in the probabilistic analysis of certain algorithms. As 
the exponent of the power weighting increases, the distribution undergoes a phase 
transition from the normal contribution being dominant to the boundary effects 
dominating. In the critical case where the weight is simple Euclidean length, both 
effects contribute significantly to the limit law. We also give a law of large numbers 
for the total weight of the graph. 

Key words and phrases: Spanning tree; nearest neighbour graph; weak convergence;
fixed-point equation; phase transition; fragmentation process.



1 Introduction 

There has been considerable recent interest in graphs generated over random point
sets, consisting of independent uniform points in the unit square, by connecting
nearby points according to some deterministic rule. Such graphs include the geometric
graph, the nearest neighbour graph and the minimal-length spanning tree. Many
aspects of the large-sample asymptotic theory for such graphs, when they are locally
determined in a certain sense, are by now quite well understood. See for example [?].

One such graph is the minimal directed spanning tree (or MDST for short),
which was introduced by Bhatt and Roy in [?]. In the MDST, each point $x$ of a
finite (random) subset $S$ of $(0,1]^2$ is connected by a directed edge to the nearest
$y \in S \cup \{(0,0)\}$ such that $y \neq x$ and $y \preccurlyeq^* x$, where $y \preccurlyeq^* x$ means that each
component of $x - y$ is nonnegative. See Figure 1 for a realisation of the MDST on
simulated random points.

Motivation comes from the modelling of communications or drainage networks
(see [?]). For example, consider the problem of designing a set of canals to
connect a set of hubs, so as to minimize their total length subject to a constraint 
that all canals must flow downhill. The mathematical formulation given above for 
this constraint can lead to significant boundary effects due to the possibility of 
long edges occurring near the lower and left boundaries of the unit square; these 
boundary effects distinguish the MDST qualitatively from the standard minimal 
spanning tree and the nearest neighbour graph for point sets in the plane. Another 
difference is the fact that there is no uniform upper bound on vertex degrees in the 
MDST. 

In the present work, we consider the total length of the MDST on random points
in $(0,1]^2$, as the number of points becomes large. We also consider the total length
of the minimal directed spanning forest (MDSF), which is the MDST with edges
incident to the origin removed (see Figure 1 for an example). In [?], Bhatt and Roy
mention that the total length is an object of considerable interest, although they
restrict their analysis to the length of the edges joined to the origin (subsequently
also examined in [?]). A first order result for the total length of the MDST or
MDSF is a law of large numbers; we derive this in Theorem 2.1 for a family of
MDSFs indexed by partial orderings on $\mathbb{R}^2$, which include $\preccurlyeq^*$ as a special case.

This paper is mainly concerned with establishing second order results, i.e., weak 
convergence results for the distribution of the total length, suitably centred and 
scaled. For the length of edges from points in the region away from the boundary, 
we prove a central limit theorem. The boundary effects are significant, and near 
the boundary the MDST can be described in terms of a one-dimensional, on-line 
version of the MDST which we call the directed linear tree (DLT), and which we 
examine in Section 3. In the DLT, each point in a sequence of independent uniform
random points in an interval is joined to its nearest neighbour to the left, amongst 
those points arriving earlier in the sequence. This DLT is of separate interest in 
relation to, for example, network modelling and molecular fragmentation (see [?],
[1], and references therein). 

In Theorem 3.1 we establish that the limiting distribution of the centred total
length of the DLT is characterized by a distributional fixed-point equation, which
resembles those encountered in the probabilistic analysis of algorithms such as
Quicksort [?]. Such fixed-point distributional equalities, and the so-called 'divide and
conquer' or recursive algorithms from which they arise, have received considerable
attention recently; see, for example, [?].

We consider power-weighted edges. Our weak convergence results (Theorem 2.2)
demonstrate that, depending on the value chosen for the weight exponent of
the edges, there are two regimes in which either the boundary effects dominate
or those edges away from the boundary are dominant, and that there is a critical
value (when we take simple Euclidean length as the weight) for which neither effect
dominates.

In the related paper [16], we give results dealing with the weight of the edges
joined to the origin, including weak convergence results, in which the limiting
distributions are given in terms of some generalized Dickman distributions. Subsequently,
it has been shown [2] that this two-dimensional case is rather special: in higher
dimensions the corresponding limits are normally distributed. The paper [16] also deals
with the maximum edge length of the MDST (the maximum length of those edges incident
to the origin was dealt with in [?]).

In the next section we give formal definitions of the MDST and MDSF, and state
our main results (Theorems 2.1 and 2.2) on the total length of the MDST and MDSF.
The results on the DLT which we present in Section 3, and the general central limit
theorems which we present in Section 4, are of some independent interest.




Figure 1: Realizations of the MDSF (left) and MDST on 100 simulated random points in
the unit square, under the partial ordering $\preccurlyeq^*$.



2 Definitions and main results 

We work in the same framework as [16]. Here we briefly recall the relevant
terminology. See [16] for more detail.

Suppose $V$ is a finite set endowed with a partial ordering $\preccurlyeq$. A minimal element,
or sink, of $V$ is a vertex $v_0 \in V$ for which there exists no $v \in V \setminus \{v_0\}$ such that
$v \preccurlyeq v_0$. Let $V_0$ denote the set of all sinks of $V$.

The partial ordering induces a directed graph $G = (V, E)$, with vertex set $V$ and
with edge set $E$ consisting of all ordered pairs $(v, u)$ of distinct elements of $V$ such
that $u \preccurlyeq v$. A directed spanning forest (DSF) on $V$ is a subgraph $T = (V_T, E_T)$ of
$(V, E)$ such that (i) $V_T = V$ and $E_T \subseteq E$, and (ii) for each vertex $v \in V \setminus V_0$ there
exists a unique directed path in $T$ that starts at $v$ and ends at some sink $u \in V_0$. In
the case where $V_0$ consists of a single sink, we refer to any DSF on $V$ as a directed
spanning tree (DST) on $V$. If we ignore the orientation of edges then a DSF on
$V$ is indeed a forest and, if there is just one sink, then any DST on $V$ is a tree.

Suppose the directed graph $(V, E)$ carries a weight function on its edges, i.e.,
a function $w : E \to [0, \infty)$. If $T$ is a DSF on $V$, we set $w(T) := \sum_{e \in E_T} w(e)$. A
minimal directed spanning forest (MDSF) on $V$ (or, equivalently, on $G$), is a directed
spanning forest $T$ on $V$ such that $w(T) \le w(T')$ for every DSF $T'$ on $V$. If $V$ has
a single sink, then a minimal directed spanning forest on $V$ is called a minimal
directed spanning tree (MDST) on $V$.

For $v \in V$, we say that $u \in V \setminus \{v\}$ is a directed nearest neighbour of $v$ if $u \preccurlyeq v$
and $w(v, u) \le w(v, u')$ for all $u' \in V \setminus \{v\}$ such that $u' \preccurlyeq v$. For each $v \in V \setminus V_0$,
let $n_v$ denote a directed nearest neighbour of $v$ (chosen arbitrarily if $v$ has more
than one directed nearest neighbour). Then (see [?]) the subgraph $(V, E_M)$ of $(V, E)$,
obtained by taking $E_M := \{(v, n_v) : v \in V \setminus V_0\}$, is a MDSF of $V$. Thus, if all
edge-weights are distinct, the MDSF is unique, and is obtained by connecting each
non-minimal vertex to its directed nearest neighbour.
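The nearest-neighbour characterisation above can be sketched in a few lines of code. This is only an illustrative sketch, not the authors' implementation: it assumes the coordinatewise order (a point $u$ precedes $v$ when both coordinates of $u$ are at most those of $v$), power-weighted Euclidean distances, and distinct edge weights. The names `precedes` and `mdsf_edges` are hypothetical.

```python
import math

def precedes(u, v):
    """Coordinatewise order: u precedes v iff each component of v - u is nonnegative."""
    return u[0] <= v[0] and u[1] <= v[1]

def mdsf_edges(points, alpha=1.0):
    """Join each non-minimal point to its directed nearest neighbour.

    Returns (edges, total_weight): edges is a list of (v, n_v) pairs and
    total_weight is the sum of ||v - n_v||**alpha over those edges.
    """
    edges = []
    total = 0.0
    for v in points:
        # Candidates are the points u != v that precede v in the partial order.
        candidates = [u for u in points if u != v and precedes(u, v)]
        if not candidates:
            continue  # v is a sink (minimal element), so it has no outgoing edge
        n_v = min(candidates, key=lambda u: math.dist(u, v))
        edges.append((v, n_v))
        total += math.dist(n_v, v) ** alpha
    return edges, total

if __name__ == "__main__":
    pts = [(0.2, 0.2), (0.5, 0.6), (0.9, 0.1), (0.6, 0.9)]
    print(mdsf_edges(pts))                 # forest: the two sinks have no edge
    print(mdsf_edges([(0.0, 0.0)] + pts))  # adding the origin gives a tree
```

Adding the origin to the point set makes it the unique sink, so every original point receives exactly one outgoing edge and the forest becomes a tree.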

For what follows, we consider a general type of partial ordering of $\mathbb{R}^2$, denoted
$\preccurlyeq^{\theta,\phi}$, specified by the angles $\theta \in [0, 2\pi)$ and $\phi \in (0, \pi] \cup \{2\pi\}$. For $x \in \mathbb{R}^2$, let
$C_{\theta,\phi}(x)$ be the closed cone with vertex $x$ and boundaries given by the rays from $x$ at
angles $\theta$ and $\theta + \phi$, measuring anticlockwise from the upwards vertical. The partial
order is such that, for $x_1, x_2 \in \mathbb{R}^2$,

    $$ x_1 \preccurlyeq^{\theta,\phi} x_2 \quad \text{iff} \quad x_1 \in C_{\theta,\phi}(x_2). $$    (1)

We shall use $\preccurlyeq^*$ as shorthand for the special case $\preccurlyeq^{\pi/2,\pi/2}$, which is of particular
interest, as in [?]. In this case $u \preccurlyeq^* v$ for $u = (u_1, u_2)$, $v = (v_1, v_2) \in \mathbb{R}^2$ if and only
if $u_1 \le v_1$ and $u_2 \le v_2$. The symbol $\preccurlyeq$ will denote a general partial order on $\mathbb{R}^2$.

We do not permit here the case $\phi = 0$, which would almost surely give us
a disconnected point set. Nor do we allow $\pi < \phi < 2\pi$, since in this case the
directional relation is not a partial order: the transitivity property (if
$u \preccurlyeq v$ and $v \preccurlyeq w$ then $u \preccurlyeq w$) fails for $\pi < \phi < 2\pi$. We shall, however, allow the
case $\phi = 2\pi$, which leads to the standard nearest neighbour (directed) graph.

The weight function is given by power-weighted Euclidean distance, i.e., for
$(u, v) \in E$ we assign weight $w(u, v) = \|u - v\|^\alpha$ to the edge $(u, v)$, where $\|\cdot\|$ denotes
the Euclidean norm on $\mathbb{R}^2$, and $\alpha > 0$ is an arbitrary fixed parameter. Thus, when
$\alpha = 1$ the weight of an edge is simply its Euclidean length. Moreover, we shall
assume that $V \subset \mathbb{R}^2$ is given by $V = S$ or $V = S^0 := S \cup \{0\}$, where $0$ is the
origin in $\mathbb{R}^2$ and $S$ is generated in a random manner. The random point set $S$ will
usually be either the set of points given by a homogeneous Poisson point process $\mathcal{P}_n$
of intensity $n$ on the unit square $(0,1]^2$, or a binomial point process $\mathcal{X}_n$ consisting
of $n$ independent uniformly distributed points on $(0,1]^2$.

Note that in this random setting, each point of $S$ almost surely has a unique
directed nearest neighbour, so that $V$ has a unique MDSF, which does not depend
on the choice of $\alpha$. Denote by $\mathcal{L}^\alpha(S)$ the total weight of all the edges in the MDSF
on $S$, and let $\tilde{\mathcal{L}}^\alpha(S) := \mathcal{L}^\alpha(S) - E[\mathcal{L}^\alpha(S)]$, the centred total weight.

Our first result presents laws of large numbers for the total edge weight for the
general partial order $\preccurlyeq^{\theta,\phi}$ and general $0 < \alpha < 2$. We state the result for $n$ points
uniformly distributed on $(0,1]^2$, but the proof carries through to other distributions
(see the start of Section 5).

Theorem 2.1 Suppose $0 < \alpha < 2$. Under the general partial order $\preccurlyeq^{\theta,\phi}$, with
$0 \le \theta < 2\pi$ and $0 < \phi \le \pi$ or $\phi = 2\pi$, it is the case that

    $$ n^{(\alpha/2)-1} \mathcal{L}^\alpha(\mathcal{X}_n) \longrightarrow (2/\phi)^{\alpha/2}\, \Gamma(1 + \alpha/2), \quad \text{as } n \to \infty. $$    (2)

Also, when the partial order is $\preccurlyeq^*$, (2) remains true with the addition of the origin,
i.e. with $\mathcal{X}_n$ replaced by $\mathcal{X}_n^0$.

Remark. In the special case $\alpha = 1$, the limit in (2) is $\sqrt{\pi/(2\phi)}$. This limit is 1
when $\phi = \pi/2$. Also, for $\phi = 2\pi$ we have the standard nearest neighbour (directed)
graph (that is, every point is joined to its nearest neighbour by a directed edge),
and this limit is then 1/2. This result (for $\alpha = 1$, $\phi = 2\pi$) is stated without proof
(and attributed to Miles [?]) in [12], but we have not previously seen the limiting
constant derived explicitly, either in [12] or anywhere else.
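As a quick numerical check of the Remark, the limiting constant can be evaluated directly. The sketch below reads the limit in (2) as $(2/\phi)^{\alpha/2}\,\Gamma(1+\alpha/2)$, which is the form that reproduces the special values quoted above; the function name `lln_limit` is hypothetical.

```python
import math

def lln_limit(alpha, phi):
    """Limiting constant (2/phi)**(alpha/2) * Gamma(1 + alpha/2) in (2)."""
    return (2.0 / phi) ** (alpha / 2.0) * math.gamma(1.0 + alpha / 2.0)

if __name__ == "__main__":
    print(lln_limit(1.0, math.pi / 2))  # coordinatewise order, Euclidean length
    print(lln_limit(1.0, 2 * math.pi))  # standard nearest neighbour graph
```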

Our main result (Theorem 2.2) presents convergence in distribution for the case
where the partial order is $\preccurlyeq^*$; the limiting distributions are of a different type in
the three cases $\alpha = 1$ (the same situation as [?]), $0 < \alpha < 1$, and $\alpha > 1$. We define
these limiting distributions in Theorem 2.2 in terms of distributional fixed-point
equations. These fixed-point equations are of the form

    $$ X \stackrel{d}{=} \sum_{r=1}^{k} A_r X^{(r)} + B, $$    (3)

where $k \in \mathbb{N}$, $X^{(r)}$, $r = 1, \ldots, k$, are independent copies of the random variable $X$,
and $(A_1, \ldots, A_k, B)$ is a random vector, independent of $(X^{(1)}, \ldots, X^{(k)})$, satisfying
the conditions

    $$ E\Big[\sum_{r=1}^{k} |A_r|^2\Big] < 1, \qquad E[B] = 0, \qquad E[B^2] < \infty. $$    (4)

Theorem 3 of Rösler [?] (proved using the contraction mapping theorem; see also
[13, 22]) says that if (4) holds, there is a unique square-integrable distribution with
mean zero satisfying the fixed-point equation (3), and this will guarantee uniqueness
of solutions to all the distributional fixed-point equalities considered in the sequel.

Define the random variable $\tilde{D}_1$ to have the distribution that is the unique
solution to the distributional fixed-point equation

    $$ \tilde{D}_1 \stackrel{d}{=} U \tilde{D}_1^{(1)} + (1 - U) \tilde{D}_1^{(2)} + U \log U + (1 - U) \log(1 - U) + U, $$    (5)

where $U$ is uniform on $(0, 1)$ and independent of the other variables on the right. We
shall see later (in Propositions 3.5 and 3.6) that $E[\tilde{D}_1] = 0$ and $\mathrm{Var}[\tilde{D}_1] = 2 - \pi^2/6$;
higher order moments are given recursively by eqn (14).

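For $\alpha > 1$, let $\tilde{D}_\alpha$ denote a random variable with distribution characterized by
the fixed-point equation

    $$ \tilde{D}_\alpha \stackrel{d}{=} U^\alpha \tilde{D}_\alpha^{(1)} + (1 - U)^\alpha \tilde{D}_\alpha^{(2)} + \frac{\alpha}{\alpha - 1} U^\alpha + \frac{1}{\alpha - 1} (1 - U)^\alpha - \frac{1}{\alpha - 1}, $$    (6)

where again $U$ is uniform on $(0, 1)$ and independent of the other variables on the
right. Also for $\alpha > 1$, let $\tilde{F}_\alpha$ denote a random variable with distribution
characterized by the fixed-point equation

    $$ \tilde{F}_\alpha \stackrel{d}{=} U^\alpha \tilde{F}_\alpha + (1 - U)^\alpha \tilde{D}_\alpha + \frac{U^\alpha}{\alpha(\alpha - 1)} + \frac{(1 - U)^\alpha}{\alpha - 1} - \frac{1}{\alpha(\alpha - 1)}, $$    (7)

where $U$ is uniform on $(0, 1)$, $\tilde{D}_\alpha$ has the distribution given by (6), and the $U$, $\tilde{D}_\alpha$
and $\tilde{F}_\alpha$ on the right are independent. In Section 3 we shall see that the random
variables $\tilde{D}_\alpha$, $\tilde{F}_\alpha$ for $\alpha > 1$ arise as centred versions of random variables (denoted
$D_\alpha$, $F_\alpha$ respectively) satisfying somewhat simpler fixed-point equations. Thus $\tilde{D}_\alpha$
and $\tilde{F}_\alpha$ both have mean zero; their variances are given by eqns (38) and (40) below.
Let $\mathcal{N}(0, s^2)$ denote the normal distribution with mean zero and variance $s^2$.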

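Theorem 2.2 Suppose the weight exponent is $\alpha > 0$ and the partial order is $\preccurlyeq^*$.
There exist constants $0 < t_\alpha \le s_\alpha < \infty$ such that, for normal random variables
$Y_\alpha \sim \mathcal{N}(0, s_\alpha^2)$ and $W_\alpha \sim \mathcal{N}(0, t_\alpha^2)$:

(i) As $n \to \infty$,

    $$ n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n^0) \stackrel{d}{\longrightarrow} Y_\alpha \ \text{ and } \ n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n^0) \stackrel{d}{\longrightarrow} W_\alpha \quad (0 < \alpha < 1); $$    (8)
    $$ \tilde{\mathcal{L}}^1(\mathcal{P}_n^0) \stackrel{d}{\longrightarrow} \tilde{D}_1^{(1)} + \tilde{D}_1^{(2)} + Y_1 \ \text{ and } \ \tilde{\mathcal{L}}^1(\mathcal{X}_n^0) \stackrel{d}{\longrightarrow} \tilde{D}_1^{(1)} + \tilde{D}_1^{(2)} + W_1; $$    (9)
    $$ \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n^0) \stackrel{d}{\longrightarrow} \tilde{D}_\alpha^{(1)} + \tilde{D}_\alpha^{(2)} \ \text{ and } \ \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n^0) \stackrel{d}{\longrightarrow} \tilde{D}_\alpha^{(1)} + \tilde{D}_\alpha^{(2)} \quad (\alpha > 1). $$    (10)

Here all the random variables in the limits are independent, and $\tilde{D}_\alpha^{(i)}$, $i = 1, 2$, are
independent copies of the random variable $\tilde{D}_\alpha$ defined at (5) for $\alpha = 1$ and at (6) for
$\alpha > 1$.

(ii) As $n \to \infty$,

    $$ n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n) \stackrel{d}{\longrightarrow} Y_\alpha \ \text{ and } \ n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n) \stackrel{d}{\longrightarrow} W_\alpha \quad (0 < \alpha < 1); $$    (11)
    $$ \tilde{\mathcal{L}}^1(\mathcal{P}_n) \stackrel{d}{\longrightarrow} \tilde{D}_1^{(1)} + \tilde{D}_1^{(2)} + Y_1 \ \text{ and } \ \tilde{\mathcal{L}}^1(\mathcal{X}_n) \stackrel{d}{\longrightarrow} \tilde{D}_1^{(1)} + \tilde{D}_1^{(2)} + W_1; $$    (12)
    $$ \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n) \stackrel{d}{\longrightarrow} \tilde{F}_\alpha^{(1)} + \tilde{F}_\alpha^{(2)} \ \text{ and } \ \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n) \stackrel{d}{\longrightarrow} \tilde{F}_\alpha^{(1)} + \tilde{F}_\alpha^{(2)} \quad (\alpha > 1). $$    (13)

Here all the random variables in the limits are independent, and $\tilde{D}_1^{(i)}$, $i = 1, 2$,
are independent copies of $\tilde{D}_1$ with distribution defined at (5), and for $\alpha > 1$, $\tilde{F}_\alpha^{(i)}$,
$i = 1, 2$, are independent copies of $\tilde{F}_\alpha$ with distribution defined at (7).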
Remarks. The normal random variables $Y_\alpha$ or $W_\alpha$ arise from the edges away from
the boundary (see Section 6). The non-normal variables (the $\tilde{D}$s and $\tilde{F}$s) arise from
the edges very close to the boundary, where the MDSF is asymptotically close to
the 'directed linear forest' discussed in Section 3.

Theorem 2.2 indicates a phase transition in the character of the limit law as
$\alpha$ increases. The normal contribution (from the points away from the boundary)
dominates for $0 < \alpha < 1$, while the boundary contributions dominate for $\alpha > 1$.
In the critical case $\alpha = 1$, neither effect dominates and both terms contribute
significantly to the asymptotic behaviour.

Noteworthy in the case $\alpha = 1$ is the fact that by (9) and (12), the limiting
distribution is the same for $\tilde{\mathcal{L}}^1(\mathcal{P}_n)$ as for $\tilde{\mathcal{L}}^1(\mathcal{P}_n^0)$, and the same for $\tilde{\mathcal{L}}^1(\mathcal{X}_n)$ as for
$\tilde{\mathcal{L}}^1(\mathcal{X}_n^0)$. Note, however, that the difference $\tilde{\mathcal{L}}^1(\mathcal{P}_n^0) - \tilde{\mathcal{L}}^1(\mathcal{P}_n)$ is the (centred) total
length of edges incident to the origin, which is not negligible, but itself converges
in distribution (see [?]) to a non-degenerate random variable, namely a centred
generalized Dickman random variable with parameter 2 (see (28) below). As an
extension of Theorem 2.2, it should be possible to show that the joint distribution
of $(\tilde{\mathcal{L}}^1(\mathcal{P}_n), \tilde{\mathcal{L}}^1(\mathcal{P}_n^0))$ converges to that of two coupled random variables, both having
the distribution of $\tilde{D}_1^{(1)} + \tilde{D}_1^{(2)} + Y_1$, whose difference has the centred generalized Dickman
distribution with parameter 2. Likewise for the joint distribution of $(\tilde{\mathcal{L}}^1(\mathcal{X}_n), \tilde{\mathcal{L}}^1(\mathcal{X}_n^0))$.

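Of particular interest is the distribution of the variable $\tilde{D}_1$ appearing in Theorem
2.2. In Section 3.4 we give a plot (Figure 2) of the probability density function of
this distribution, estimated by simulation. Also, we can use the fixed-point equation
(5) to calculate the moments of $\tilde{D}_1$ recursively. Writing

    $$ f(U) := U \log U + (1 - U) \log(1 - U) + U, $$

and setting $m_k := E[\tilde{D}_1^k]$ (with $m_0 := 1$), expanding the $k$th power of the right-hand
side of (5) and using independence, we obtain

    $$ m_k = \sum_{i + j + l = k} \frac{k!}{i!\, j!\, l!}\, m_i m_j\, E\big[U^i (1 - U)^j f(U)^l\big]. $$    (14)

The fact that $m_1 = 0$ simplifies things a little. Collecting the two terms containing
$m_k$, whose coefficient is $E[U^k + (1 - U)^k] = 2/(k+1)$, we can rewrite this for $k \ge 2$ as

    $$ m_k = \frac{k+1}{k-1} \Big( E[f(U)^k] + \sum_{j=2}^{k-1} \binom{k}{j} m_j E\big[f(U)^{k-j} (U^j + (1 - U)^j)\big]
        + \sum_{i, j \ge 2,\ i + j \le k} \frac{k!}{i!\, j!\, (k-i-j)!}\, m_i m_j E\big[U^i (1 - U)^j f(U)^{k-i-j}\big] \Big). $$

So, for example, when $k = 3$ we obtain $m_3 \approx 0.15411$, which shows $\tilde{D}_1$ is not
Gaussian and is consistent with the skewness of the plot in Figure 2.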
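The first steps of the moment recursion are easy to reproduce numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: it expands the powers of $U \tilde{D}_1^{(1)} + (1-U)\tilde{D}_1^{(2)} + f(U)$ by hand for $k = 2$ and $k = 3$, uses $m_1 = 0$, and approximates expectations over $U$ by a midpoint rule. The helper names `f` and `expect` are hypothetical.

```python
import math

def f(u):
    """f(u) = u log u + (1 - u) log(1 - u) + u, the additive term in (5)."""
    return u * math.log(u) + (1.0 - u) * math.log(1.0 - u) + u

def expect(g, n=20000):
    """Midpoint-rule approximation of E[g(U)] for U uniform on (0, 1)."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n

# k = 2: the m_2 terms carry coefficient E[U^2 + (1-U)^2] = 2/3 and m_1 = 0,
# so m_2 (1 - 2/3) = E[f(U)^2].
m2 = 3.0 * expect(lambda u: f(u) ** 2)

# k = 3: the m_3 terms carry coefficient 2/4, and the only other surviving
# terms are E[f^3] and 3 m_2 E[f(U) (U^2 + (1-U)^2)].
m3 = 2.0 * (expect(lambda u: f(u) ** 3)
            + 3.0 * m2 * expect(lambda u: f(u) * (u * u + (1.0 - u) ** 2)))

if __name__ == "__main__":
    print(m2, 2.0 - math.pi ** 2 / 6.0)  # second moment against 2 - pi^2/6
    print(m3)                            # third moment, positive (right skew)
```

A positive third moment together with mean zero is what rules out a Gaussian law here.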

The remainder of this paper is organized as follows. After discussion of the DLT
in Section 3, in Section 4 we present general limit theorems in geometric probability,
which we shall use in obtaining our main results for the MDST. Theorem 2.1 is
proved in Section 5 (this proof does not use the results of Section 4). The proof of
Theorem 2.2 is prepared in Sections 6 and 7 and completed in Section 8. In these
proofs, we repeatedly use Slutsky's theorem (see e.g. [14]), which says that if $X_n \to X$
in distribution and $Y_n \to 0$ in probability, then $X_n + Y_n \to X$ in distribution.

3 The directed linear forest and tree 

The directed linear forest (DLF) and directed linear tree (DLT) are for us a tool 
for the analysis of the limiting behaviour of the contribution to the total weight of 
the random MDSF/MDST from edges near the boundary of the unit square. In 
the present section we derive the properties of the DLF that we need (in particular,
Theorem 3.1); subsequently, in Theorem 7.1, we shall see that the total weight of
edges from the points near the boundaries, as $n \to \infty$, converges in distribution to
the limit of the total weight of the DLF.

The DLT is also of some intrinsic interest. It is a one-dimensional directed 
analogue of the so-called 'on-line nearest neighbour graph', which is of interest 
in the study of networks such as the world wide web (see, e.g., [?] for
more on the on-line nearest neighbour graph). Moreover, it is constructed via a
fragmentation process similar to those seen in, for example, [?]; the tree provides a
historical representation of the fragmentation process.

For any finite sequence $\mathcal{T}_m = (x_1, x_2, \ldots, x_m) \in (0, 1]^m$, we construct the directed
linear forest (DLF) as follows. We start with the unit interval $(0, 1]$ and insert the
points $x_i$ in order, one at a time, starting with $i = 1$. At the insertion of each
point, we join the new point to its nearest neighbour among those points already
present that lie to the left of the point (provided that such a point exists). In
other words, for each point $x_i$, $i \ge 2$, we join $x_i$ by a directed edge to the point
$\max\{x_j : 1 \le j < i,\ x_j < x_i\}$. If $\{x_j : 1 \le j < i,\ x_j < x_i\}$ is empty, we do not add
any directed edge from $x_i$. In this way we construct a 'directed linear forest', which
we denote by DLF($\mathcal{T}_m$). We denote the total weight (under weight function with
exponent $\alpha$) of DLF($\mathcal{T}_m$) by $D^\alpha(\mathcal{T}_m)$, that is, we set

    $$ D^\alpha(\mathcal{T}_m) := \sum_{i=2}^{m} \big(x_i - \max\{x_j : 1 \le j < i,\ x_j < x_i\}\big)^\alpha\, \mathbf{1}\{\min\{x_j : 1 \le j < i\} < x_i\}. $$

Further, given $\mathcal{T}_m$, let $\mathcal{T}_m^0$ be the sequence $(x_0, x_1, \ldots, x_m)$ where the initial term
is $x_0 := 0$. Then the DLF on $\mathcal{T}_m^0$ is constructed in the same way, where now for
each $i \ge 1$, we join $x_i$ by an edge to the point $\max\{x_j : 0 \le j < i,\ x_j < x_i\}$. But
now we see that $x_1$ will always be joined to $x_0 = 0$, and $x_2$ will be joined either to
$x_1$ (if $x_2 > x_1$) or to $x_0$, and so on. In this way we construct a 'directed linear tree'
(DLT) on vertex set $\{x_0, x_1, \ldots, x_m\}$ with $m$ edges. Denote the total weight of this
tree with weight exponent $\alpha$ by $D^\alpha(\mathcal{T}_m^0)$; that is, set

    $$ D^\alpha(\mathcal{T}_m^0) := \sum_{i=1}^{m} \big(x_i - \max\{x_j : 0 \le j < i,\ x_j < x_i\}\big)^\alpha. $$
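The two constructions above translate directly into code. The following is a minimal sketch, assuming the input is a finite sequence of points in $(0, 1]$; the function names `dlf_weight` and `dlt_weight` are hypothetical, and the quadratic-time scan is chosen for clarity rather than speed.

```python
def dlf_weight(xs, alpha=1.0):
    """Total weight of the directed linear forest on the sequence xs."""
    total = 0.0
    for i, x in enumerate(xs):
        # Points inserted earlier that lie strictly to the left of x.
        left = [y for y in xs[:i] if y < x]
        if left:  # if there is no such point, no edge is added from x
            total += (x - max(left)) ** alpha
    return total

def dlt_weight(xs, alpha=1.0):
    """Total weight of the directed linear tree: the DLF with x_0 = 0 prepended."""
    return dlf_weight([0.0] + list(xs), alpha)

if __name__ == "__main__":
    xs = [0.5, 0.2, 0.9, 0.6]
    print(dlf_weight(xs), dlt_weight(xs))
```

For the sample sequence the tree exceeds the forest by the two edges joined to 0, which come from the lower records 0.5 and 0.2.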
We shall be mainly interested in the case where $\mathcal{T}_m$ is a random vector in $(0, 1]^m$.
In this case, set $\tilde{D}^\alpha(\mathcal{T}_m) := D^\alpha(\mathcal{T}_m) - E[D^\alpha(\mathcal{T}_m)]$, the centred total weight of the
DLF, and $\tilde{D}^\alpha(\mathcal{T}_m^0) := D^\alpha(\mathcal{T}_m^0) - E[D^\alpha(\mathcal{T}_m^0)]$, the centred total weight of the DLT.

We take $\mathcal{T}_m$ to be a vector of uniform variables. Let $(X_1, X_2, X_3, \ldots)$ be a
sequence of independent uniformly distributed random variables in $(0, 1]$, and for
$m \in \mathbb{N}$ set $\mathcal{U}_m := (X_1, X_2, \ldots, X_m)$. We consider $D^\alpha(\mathcal{U}_m)$ and $D^\alpha(\mathcal{U}_m^0)$. For these
variables, we establish asymptotic behaviour of the mean value in Propositions 3.1
and 3.2, along with the following convergence results, which are the principal results
of this section.

For $\alpha > 1$, let $D_\alpha$ denote a random variable with distribution characterized by
the fixed-point equation

    $$ D_\alpha \stackrel{d}{=} U^\alpha D_\alpha^{(1)} + (1 - U)^\alpha D_\alpha^{(2)} + U^\alpha, $$    (15)

where $U$ is uniform on $(0, 1)$ and independent of the other variables on the right.
Also for $\alpha > 1$, let $F_\alpha$ denote a random variable with distribution characterized by
the fixed-point equation

    $$ F_\alpha \stackrel{d}{=} U^\alpha F_\alpha + (1 - U)^\alpha D_\alpha, $$    (16)

where $U$ is uniform on $(0, 1)$, $D_\alpha$ has the distribution given by (15), and the $U$, $D_\alpha$
and $F_\alpha$ on the right are independent. The corresponding centred random variables
$\tilde{D}_\alpha := D_\alpha - E[D_\alpha]$ and $\tilde{F}_\alpha := F_\alpha - E[F_\alpha]$ satisfy the fixed-point equations (6) and
(7) respectively. The solutions to (6) and (7) are unique by the criterion given at
(4), and hence the solutions to (15) and (16) are also unique.

Theorem 3.1 (i) As $m \to \infty$ we have $\tilde{D}^1(\mathcal{U}_m^0) \longrightarrow \tilde{D}_1$ and $\tilde{D}^1(\mathcal{U}_m) \longrightarrow \tilde{F}_1$,
where $\tilde{D}_1$ has the distribution given by the fixed-point equation (5), and $\tilde{F}_1$ has
the same distribution as $\tilde{D}_1$. Also, the variance of $\tilde{D}_1$ (and hence also of $\tilde{F}_1$)
is $2 - \pi^2/6 \approx 0.355066$. Finally, $\mathrm{Cov}(\tilde{D}_1, \tilde{F}_1) = (7/4) - \pi^2/6 \approx 0.105066$.

(ii) For $\alpha > 1$, as $m \to \infty$ we have $D^\alpha(\mathcal{U}_m^0) \to D_\alpha$, almost surely and in $L^2$,
and $D^\alpha(\mathcal{U}_m) \to F_\alpha$, almost surely and in $L^2$, where the distributions of $D_\alpha$,
$F_\alpha$ are given by the fixed-point equations (15) and (16) respectively. Also,
$E[D_\alpha] = (\alpha - 1)^{-1}$ and $E[F_\alpha] = (\alpha(\alpha - 1))^{-1}$, while $\mathrm{Var}(D_\alpha)$ and $\mathrm{Var}(F_\alpha)$ are
given by (38) and (40) respectively.

Proof. Part (i) follows from Propositions 3.5, 3.6 and 3.7 below. Part (ii) follows
from Propositions 3.3 and 3.4 below. We prove these results in the following
sections. □

An interesting property of the DLT, which we use in establishing fixed-point
equations for limit distributions, is its self-similarity (scaling property). In terms of
the total weight, this says that for any $t \in (0, 1)$, if $Y_1, \ldots, Y_n$ are independent and
uniformly distributed on $(0, t]$, then the distribution of $D^\alpha(Y_1, \ldots, Y_n)$ is the same
as that of $t^\alpha D^\alpha(X_1, \ldots, X_n)$.
3.1 The mean total weight of the DLF and DLT

First we consider the rooted case, i.e. the DLT on $\mathcal{U}_m^0$. For $m = 1, 2, 3, \ldots$ denote
by $Z_m$ the random variable given by the gain in length of the tree on the addition
of one point ($X_m$) to an existing $m - 1$ points in the DLT on a sequence of uniform
random variables $\mathcal{U}_{m-1}^0$, i.e., with the conventions $D^1(\mathcal{U}_0^0) = 0$ and $X_0 = 0$, we set

    $$ Z_m := D^1(\mathcal{U}_m^0) - D^1(\mathcal{U}_{m-1}^0) = X_m - \max\{X_j : 0 \le j < m,\ X_j < X_m\}. $$    (17)

Thus, with weight exponent $\alpha$, the $m$th edge to be added has weight $Z_m^\alpha$.

Lemma 3.1 (i) $Z_m$ has distribution function $F_m$ given by $F_m(t) = 0$ for $t < 0$,
$F_m(t) = 1$ for $t > 1$, and $F_m(t) = 1 - (1 - t)^m$ for $0 \le t \le 1$.

(ii) For $\alpha > 0$, $Z_m^\alpha$ has expectation and variance

    $$ E[Z_m^\alpha] = \frac{m!\, \Gamma(1 + \alpha)}{\Gamma(1 + \alpha + m)}, \qquad
       \mathrm{Var}[Z_m^\alpha] = \frac{m!\, \Gamma(1 + 2\alpha)}{\Gamma(1 + 2\alpha + m)} - \Big(\frac{m!\, \Gamma(1 + \alpha)}{\Gamma(1 + \alpha + m)}\Big)^2. $$    (18)

In particular,

    $$ E[Z_m] = \frac{1}{m + 1}; \qquad \mathrm{Var}[Z_m] = \frac{m}{(m + 1)^2 (m + 2)}. $$    (19)

(iii) For $\alpha > 0$, as $m \to \infty$ we have

    $$ E[Z_m^\alpha] \sim \Gamma(\alpha + 1)\, m^{-\alpha}, \qquad \mathrm{Var}[Z_m^\alpha] \sim \big(\Gamma(2\alpha + 1) - \Gamma(\alpha + 1)^2\big)\, m^{-2\alpha}. $$    (20)

(iv) As $m \to \infty$, $m Z_m$ converges in distribution to an exponential with
parameter 1.

Proof. For $0 \le t \le 1$ we have

    $$ P[Z_m > t] = P[X_m > t \text{ and none of } X_1, \ldots, X_{m-1} \text{ lies in } (X_m - t, X_m)] = (1 - t)^m, $$

and (i) follows. We then obtain (ii) since for any $\alpha > 0$ and for $k = 1, 2$,

    $$ E[Z_m^{k\alpha}] = \int_0^1 P\big[Z_m > t^{1/(k\alpha)}\big]\, dt = \int_0^1 \big(1 - t^{1/(k\alpha)}\big)^m\, dt = \frac{m!\, \Gamma(1 + k\alpha)}{\Gamma(1 + k\alpha + m)}. $$

Then (iii) follows by Stirling's formula, which yields

    $$ E[Z_m^{k\alpha}] = \Gamma(1 + k\alpha)\, m^{-k\alpha} \big(1 + O(m^{-1})\big). $$

For (iv), we have from (i) that, for $t \in [0, \infty)$, and $m$ large enough so that $(t/m) \le 1$,

    $$ P[m Z_m \le t] = F_m\Big(\frac{t}{m}\Big) = 1 - \Big(1 - \frac{t}{m}\Big)^m \to 1 - e^{-t}, \quad \text{as } m \to \infty. $$

But $1 - e^{-t}$, $t \ge 0$, is the exponential distribution function with parameter 1. □
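The moment formula in part (ii) can be checked numerically against the special cases in (19) and the asymptotics in (20). The sketch below takes the formula to be $E[Z_m^\alpha] = m!\,\Gamma(1+\alpha)/\Gamma(1+\alpha+m)$, which is the value of the Beta integral appearing in the proof; the function names are hypothetical, and log-gamma is used only to avoid overflow for large $m$.

```python
import math

def z_moment(m, a):
    """E[Z_m**a] = m! Gamma(1 + a) / Gamma(1 + a + m), via log-gamma."""
    return math.exp(math.lgamma(m + 1) + math.lgamma(1 + a) - math.lgamma(1 + a + m))

def z_variance(m, alpha):
    """Var[Z_m**alpha], from the moments of order 2*alpha and alpha."""
    return z_moment(m, 2 * alpha) - z_moment(m, alpha) ** 2

if __name__ == "__main__":
    m = 7
    print(z_moment(m, 1.0), 1 / (m + 1))                    # mean of Z_m
    print(z_variance(m, 1.0), m / ((m + 1) ** 2 * (m + 2)))  # variance of Z_m
```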

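The following result gives the asymptotic behaviour of the expected total weight
of the DLT. Let $\gamma$ denote Euler's constant, so that

    $$ \Big(\sum_{i=1}^{k} \frac{1}{i}\Big) - \log k = \gamma + O(k^{-1}). $$    (21)

Proposition 3.1 As $m \to \infty$ the expected total weight of the DLT under $\alpha$-power
weighting on $\mathcal{U}_m^0$ satisfies

    $$ E[D^\alpha(\mathcal{U}_m^0)] \sim \frac{\Gamma(\alpha + 1)}{1 - \alpha}\, m^{1 - \alpha} \quad (0 < \alpha < 1); $$    (22)

    $$ E[D^1(\mathcal{U}_m^0)] - \log m \to \gamma - 1; $$    (23)

    $$ E[D^\alpha(\mathcal{U}_m^0)] = \frac{1}{\alpha - 1} + O(m^{1 - \alpha}) \quad (\alpha > 1). $$    (24)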
a — \ 

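Proof. We have

    $$ E[D^\alpha(\mathcal{U}_m^0)] = \sum_{i=1}^{m} \big(E[D^\alpha(\mathcal{U}_i^0)] - E[D^\alpha(\mathcal{U}_{i-1}^0)]\big) = \sum_{i=1}^{m} E[Z_i^\alpha]. $$

In the case where $\alpha = 1$, $E[Z_i] = (i + 1)^{-1}$ by (19), and (23) follows by (21). For
general $\alpha > 0$, $\alpha \neq 1$, from (18) we have that

    $$ E[D^\alpha(\mathcal{U}_m^0)] = \Gamma(1 + \alpha) \sum_{i=1}^{m} \frac{\Gamma(i + 1)}{\Gamma(1 + \alpha + i)}
        = \frac{1}{\alpha - 1} - \frac{\Gamma(1 + \alpha)\, \Gamma(m + 2)}{(\alpha - 1)\, \Gamma(m + 1 + \alpha)}. $$    (25)

By Stirling's formula, the last term satisfies

    $$ -\frac{\Gamma(1 + \alpha)\, \Gamma(m + 2)}{(\alpha - 1)\, \Gamma(m + 1 + \alpha)} = -\frac{\Gamma(1 + \alpha)}{\alpha - 1}\, m^{1 - \alpha} \big(1 + O(m^{-1})\big), $$    (26)

which tends to zero as $m \to \infty$ for $\alpha > 1$, to give us (24). For $\alpha < 1$, we have (22)
from (25) and (26). □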
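The telescoping identity and the limit at $\alpha = 1$ can be confirmed numerically. The sketch below sums $E[Z_i^\alpha] = i!\,\Gamma(1+\alpha)/\Gamma(1+\alpha+i)$ directly and compares with the closed form $\big(1 - \Gamma(1+\alpha)\Gamma(m+2)/\Gamma(m+1+\alpha)\big)/(\alpha-1)$ for $\alpha \neq 1$; the function names and the hard-coded value of Euler's constant are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this check

def mean_dlt(m, alpha):
    """E[D^alpha(U_m^0)] as the sum of E[Z_i^alpha] over i = 1..m."""
    return sum(
        math.exp(math.lgamma(i + 1) + math.lgamma(1 + alpha) - math.lgamma(1 + alpha + i))
        for i in range(1, m + 1)
    )

def mean_dlt_closed(m, alpha):
    """Closed form for alpha != 1, obtained by telescoping the sum."""
    tail = math.exp(math.lgamma(1 + alpha) + math.lgamma(m + 2) - math.lgamma(m + 1 + alpha))
    return (1.0 - tail) / (alpha - 1.0)

if __name__ == "__main__":
    print(mean_dlt(50, 2.0), mean_dlt_closed(50, 2.0))        # both tend to 1/(alpha-1)
    print(mean_dlt(10000, 1.0) - math.log(10000), EULER_GAMMA - 1.0)
```

For $\alpha = 2$ the closed form reduces to $1 - 2/(m+2)$, which makes the approach to the limit $1/(\alpha - 1) = 1$ explicit.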



Now consider the unrooted case, i.e., the directed linear forest. For $\mathcal{U}_m$ as above
the total weight of the DLF is denoted $D^\alpha(\mathcal{U}_m)$, and the centred total weight is
$\tilde{D}^\alpha(\mathcal{U}_m) := D^\alpha(\mathcal{U}_m) - E[D^\alpha(\mathcal{U}_m)]$. We then see that

    $$ D^\alpha(\mathcal{U}_m^0) = D^\alpha(\mathcal{U}_m) + \mathcal{L}_0^\alpha(\mathcal{U}_m^0), $$    (27)

where $\mathcal{L}_0^\alpha(\mathcal{U}_m^0)$ is the total weight of edges incident to 0 in the DLT on $\mathcal{U}_m^0$.

The following lemma says that $\mathcal{L}_0^\alpha(\mathcal{U}_m^0)$ converges to a random variable that has
the generalized Dickman distribution with parameter $1/\alpha$ (see [16]), that is, the
distribution of a random variable $X$ which satisfies the distributional fixed-point
equation

    $$ X \stackrel{d}{=} U^\alpha (1 + X), $$    (28)

where $U$ is uniform on $(0, 1)$ and independent of the $X$ on the right. We recall from
Proposition 3 of [16] that if $X$ satisfies (28) then

    $$ E[X] = 1/\alpha, \quad \text{and} \quad E[X^2] = (\alpha + 2)/(2\alpha^2). $$    (29)
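The two moments in (29) can also be read off from the fixed-point equation $X \stackrel{d}{=} U^\alpha(1 + X)$ itself. The short derivation below is a sketch that assumes $X$ has a finite second moment and uses only the independence of $U$ and $X$ together with $E[U^s] = 1/(s+1)$ for $s > 0$.

```latex
\begin{align*}
E[X] &= E[U^\alpha]\,\bigl(1 + E[X]\bigr) = \frac{1 + E[X]}{\alpha + 1}
  &&\Longrightarrow\quad E[X] = \frac{1}{\alpha}, \\
E[X^2] &= E[U^{2\alpha}]\,\bigl(1 + 2E[X] + E[X^2]\bigr)
        = \frac{1 + 2/\alpha + E[X^2]}{2\alpha + 1}
  &&\Longrightarrow\quad E[X^2] = \frac{\alpha + 2}{2\alpha^2}.
\end{align*}
```

In particular $\mathrm{Var}[X] = E[X^2] - E[X]^2 = 1/(2\alpha)$.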
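Lemma 3.2 Let $\alpha > 0$. There is a random variable $\mathcal{L}_0^\alpha$ with the generalized
Dickman distribution with parameter $1/\alpha$, such that as $m \to \infty$, we have that
$\mathcal{L}_0^\alpha(\mathcal{U}_m^0) \to \mathcal{L}_0^\alpha$, almost surely and in $L^2$.

Proof. Let $\delta_0(\mathcal{U}_m^0)$ denote the degree of the origin in the directed linear tree on
$\mathcal{U}_m^0$, so that $\delta_0(\mathcal{U}_m^0)$ is the number of lower records in the sequence $(X_1, \ldots, X_m)$.
Then

    $$ \mathcal{L}_0^\alpha(\mathcal{U}_m^0) = U_1^\alpha + (U_1 U_2)^\alpha + \cdots + (U_1 \cdots U_{\delta_0(\mathcal{U}_m^0)})^\alpha, $$    (30)

where $(U_1, U_2, \ldots)$ is a certain sequence of independent uniform random variables on
$(0, 1)$, namely the ratios between successive lower records of the sequence $(X_n)$. The
sum $U_1^\alpha + (U_1 U_2)^\alpha + (U_1 U_2 U_3)^\alpha + \cdots$ has nonnegative terms and finite expectation,
so it converges almost surely to a limit which we denote $\mathcal{L}_0^\alpha$. Then $\mathcal{L}_0^\alpha$ has the
generalized Dickman distribution with parameter $1/\alpha$ (see Proposition 2 of [16]).

Since $\delta_0(\mathcal{U}_m^0)$ tends to infinity almost surely as $m \to \infty$, we have $\mathcal{L}_0^\alpha(\mathcal{U}_m^0) \to \mathcal{L}_0^\alpha$
almost surely. Also, $E[(\mathcal{L}_0^\alpha)^2] < \infty$, by (29), and $(\mathcal{L}_0^\alpha - \mathcal{L}_0^\alpha(\mathcal{U}_m^0))^2 \le (\mathcal{L}_0^\alpha)^2$ for all
$m$. Thus $E[(\mathcal{L}_0^\alpha(\mathcal{U}_m^0) - \mathcal{L}_0^\alpha)^2] \to 0$ by the dominated convergence theorem, and so
we have the $L^2$ convergence as well. □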

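Proposition 3.2 As $m \to \infty$ the expected total weight of the DLF under $\alpha$-power
weighting on $\mathcal{U}_m$ satisfies

    $$ E[D^\alpha(\mathcal{U}_m)] \sim \frac{\Gamma(\alpha + 1)}{1 - \alpha}\, m^{1 - \alpha} \quad (0 < \alpha < 1); $$    (31)

    $$ E[D^1(\mathcal{U}_m)] - \log m \to \gamma - 2; $$    (32)

    $$ E[D^\alpha(\mathcal{U}_m)] \to \frac{1}{\alpha(\alpha - 1)} \quad (\alpha > 1). $$    (33)

Proof. By (27) we have $E[D^\alpha(\mathcal{U}_m)] = E[D^\alpha(\mathcal{U}_m^0)] - E[\mathcal{L}_0^\alpha(\mathcal{U}_m^0)]$. By Lemma 3.2
and (29),

    $$ E[\mathcal{L}_0^\alpha(\mathcal{U}_m^0)] \to E[\mathcal{L}_0^\alpha] = 1/\alpha. $$

We then obtain (31), (32) and (33) from Proposition 3.1. □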

3.2 Orthogonal increments for $\alpha = 1$

In this section we shall show (in Lemma 3.5) that when $\alpha = 1$, the variables $Z_i$, $i \ge 1$,
are mutually orthogonal, in the sense of having zero covariances, which will be used
later on to establish convergence of the (centred) total length of the DLT. To prove
this, we first need further notation.

Given $X_1, \ldots, X_m$, let us denote the order statistics of $X_1, \ldots, X_m$, taken in
increasing order, as $X^m_{(1)}, X^m_{(2)}, \ldots, X^m_{(m)}$. Thus $(X^m_{(1)}, X^m_{(2)}, \ldots, X^m_{(m)})$ is a
nondecreasing sequence, forming a permutation of the original $(X_1, \ldots, X_m)$. Denote
the existing $m + 1$ intervals between points by $I_j^m := (X^m_{(j-1)}, X^m_{(j)})$ for
$j = 1, 2, \ldots, m + 1$, where we set $X^m_{(0)} := 0$ and $X^m_{(m+1)} := 1$. Let the widths of these
intervals (the spacings) be

    $$ S_j^m := |I_j^m| = X^m_{(j)} - X^m_{(j-1)}, $$

for $1 \le j \le m + 1$. Then $0 \le S_j^m \le 1$ for $1 \le j \le m + 1$, and $\sum_{j=1}^{m+1} S_j^m = 1$. That
is, the vector $(S_1^m, S_2^m, \ldots, S_{m+1}^m)$ belongs to the $m$-dimensional simplex, $\Delta_m$. Note
that only $m$ of the $S_j^m$ are required to specify the vector.

We can arrange the spacings themselves ($S_j^m$, $1 \le j \le m + 1$) into increasing
order to give $S^m_{(1)}, S^m_{(2)}, \ldots, S^m_{(m+1)}$. Then let $\mathcal{F}_S^m$ denote the sigma field generated
by these ordered spacings, so that

    $$ \mathcal{F}_S^m = \sigma\big(S^m_{(1)}, \ldots, S^m_{(m+1)}\big). $$    (34)

The following interpretation of $\mathcal{F}_S^m$ may be helpful. The set $(0, 1) \setminus \{X_1, \ldots, X_m\}$
consists almost surely of $m + 1$ connected components ('fragments') of total length
1, and $\mathcal{F}_S^m$ is the $\sigma$-field generated by the collection of lengths of these fragments,
ignoring the order in which they appear.

By definition, the value of $Z_m$ must be one of the (ordered) spacings $S^m_{(1)}, \ldots, S^m_{(m+1)}$.
The next result says that, given the values of these spacings, each of the possible
values for $Z_m$ is equally likely.

Lemma 3.3 For $m \ge 1$ we have

    $$ P\big[Z_m = S^m_{(i)} \,\big|\, \mathcal{F}_S^m\big] = \frac{1}{m + 1} \quad \text{a.s., for } i = 1, \ldots, m + 1. $$    (35)

Hence,

    $$ E\big[Z_m \,\big|\, \mathcal{F}_S^m\big] = \frac{1}{m + 1} \sum_{i=1}^{m+1} S^m_{(i)} = \frac{1}{m + 1}. $$    (36)


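Proof. First we note that $(X^m_{(1)}, \ldots, X^m_{(m)})$ is uniformly distributed over

    $$ \{(x_1, \ldots, x_m) : 0 \le x_1 \le x_2 \le \cdots \le x_m \le 1\}. $$

Now

    $$ \begin{pmatrix} S_1^m \\ S_2^m \\ S_3^m \\ \vdots \\ S_m^m \end{pmatrix}
       = \begin{pmatrix} 1 & & & & \\ -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{pmatrix}
         \begin{pmatrix} X^m_{(1)} \\ X^m_{(2)} \\ X^m_{(3)} \\ \vdots \\ X^m_{(m)} \end{pmatrix}. $$

The $m$ by $m$ matrix here has determinant 1. Hence $(S_1^m, \ldots, S_m^m)$ is uniform over

    $$ \Big\{(x_1, \ldots, x_m) : \sum_{j=1}^{m} x_j \le 1;\ x_j \ge 0,\ \forall\, 1 \le j \le m\Big\}. $$

Then $(S_1^m, \ldots, S_{m+1}^m)$ is uniform over the $m$-dimensional simplex $\Delta_m$. In particular,
the $S_j^m$ are exchangeable. Thus given $S^m_{(1)}, \ldots, S^m_{(m+1)}$, i.e. $\mathcal{F}_S^m$, the actual values
of $S_1^m, \ldots, S_{m+1}^m$ are equally likely to be any permutation of $S^m_{(1)}, \ldots, S^m_{(m+1)}$, and
given $S_1^m, \ldots, S_{m+1}^m$ the value of $Z_m$ is equally likely to be any of $S_1^m, \ldots, S_m^m$ (but
cannot be $S_{m+1}^m$).

Hence, given $S^m_{(1)}, \ldots, S^m_{(m+1)}$ the probability that $Z_m = S^m_{(i)}$ is $(1/m) \times m/(m + 1)
= 1/(m + 1)$, i.e. we have (35), and then (36) follows since $\sum_{j=1}^{m+1} S^m_{(j)} = 1$. □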
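Lemma 3.4 Let $1 \le m < \ell$. Given $\mathcal{F}_S^m$, $Z_\ell$ and $Z_m$ are conditionally independent.

Proof. Given $\mathcal{F}_S^m$, we have $S^m_{(1)}, \ldots, S^m_{(m+1)}$, and by (35) the (conditional)
distribution of $Z_m$ is uniform on $\{S^m_{(1)}, \ldots, S^m_{(m+1)}\}$. The conditional distribution of $Z_\ell$,
$\ell > m$, given $\mathcal{F}_S^m$ depends only on $S^m_{(1)}, \ldots, S^m_{(m+1)}$ and not which one of them $Z_m$
happens to be. Hence $Z_m$ and $Z_\ell$ are conditionally independent. □

Lemma 3.5 For $1 \le m < \ell$, the random variables $Z_m$, $Z_\ell$ satisfy $\mathrm{Cov}[Z_m, Z_\ell] = 0$.

Proof. From Lemmas 3.3 and 3.4,

    $$ E\big[Z_m Z_\ell \,\big|\, \mathcal{F}_S^m\big] = E\big[Z_m \,\big|\, \mathcal{F}_S^m\big]\, E\big[Z_\ell \,\big|\, \mathcal{F}_S^m\big] = \frac{1}{m + 1}\, E\big[Z_\ell \,\big|\, \mathcal{F}_S^m\big], $$

and by taking expectations we obtain

    $$ E[Z_m Z_\ell] = \frac{1}{m + 1}\, E[Z_\ell] = \frac{1}{m + 1} \cdot \frac{1}{\ell + 1} = E[Z_m] \cdot E[Z_\ell]. $$

Hence the covariance of $Z_m$ and $Z_\ell$ is zero. □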

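Remarks. (i) Calculations yield, for example, that $E[D^1(\mathcal{U}_1^0)] = E[Z_1] = 1/2$, $E[D^1(\mathcal{U}_2^0)] = 5/6$, and $\mathrm{Var}[Z_1] = 1/12$, $\mathrm{Var}[Z_2] = 1/18$, $\mathrm{Var}[D^1(\mathcal{U}_2^0)] = 5/36$.

(ii) The orthogonality structure of the $Z_m$ is unique to the $\alpha = 1$ case. For example, it can be shown that, for $\alpha > 0$,

\[ E[Z_1^\alpha] E[Z_2^\alpha] = \frac{2}{(1+\alpha)^2 (2+\alpha)}, \quad \text{and} \quad E[Z_1^\alpha Z_2^\alpha] = \frac{1}{2(1+\alpha)^2} \left( 1 + \frac{2\Gamma(\alpha+2)^2}{\Gamma(2\alpha+3)} \right). \]

Then

\[ \mathrm{Cov}[Z_1^\alpha, Z_2^\alpha] = \frac{(\alpha-2)\Gamma(2\alpha+3) + 2(\alpha+2)\Gamma(\alpha+2)^2}{2(\alpha+1)^2(\alpha+2)\Gamma(2\alpha+3)}, \]

and this quantity is zero only if $\alpha = 1$; it is positive for $\alpha > 1$ and negative for $0 < \alpha < 1$.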
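
The sign pattern of this covariance is easy to check numerically. The sketch below evaluates the closed form and, as a cross-check, the difference between the mean of the product and the product of the means. The function names are illustrative.

```python
from math import gamma


def cov_z1_z2(alpha):
    """Closed form for Cov[Z_1^alpha, Z_2^alpha], alpha > 0."""
    num = (alpha - 2) * gamma(2 * alpha + 3) + 2 * (alpha + 2) * gamma(alpha + 2) ** 2
    den = 2 * (alpha + 1) ** 2 * (alpha + 2) * gamma(2 * alpha + 3)
    return num / den


def cov_z1_z2_direct(alpha):
    """The same covariance as E[Z_1^a Z_2^a] - E[Z_1^a] E[Z_2^a]."""
    product_of_means = 2 / ((1 + alpha) ** 2 * (2 + alpha))
    mean_of_product = (1 + 2 * gamma(alpha + 2) ** 2 / gamma(2 * alpha + 3)) / (
        2 * (1 + alpha) ** 2
    )
    return mean_of_product - product_of_means
```

Evaluating `cov_z1_z2` at 0.5, 1.0 and 2.0 gives a negative value, zero and a positive value respectively.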
3.3 Limit behaviour for $\alpha > 1$



We now consider the limit distribution of the total weight of the DLT and DLF. In the present section we consider the case of $\alpha$-power weighted edges with $\alpha > 1$; that is, we prove part (ii) of Theorem 3.1. To describe the moments of the limiting distribution of $D^\alpha(\mathcal{U}_m^0)$ and $D^\alpha(\mathcal{U}_m)$, we introduce the notation

\[ J(\alpha) := \int_0^1 u^\alpha (1-u)^\alpha \, du = 2^{-1-2\alpha} \frac{\sqrt{\pi}\, \Gamma(\alpha+1)}{\Gamma(\alpha+3/2)}. \tag{37} \]

We start with the rooted case ($D^\alpha(\mathcal{U}_m^0)$), and subsequently consider the unrooted case ($D^\alpha(\mathcal{U}_m)$).
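
As a sanity check on (37), the sketch below compares the closed form with the Beta-function expression $\Gamma(\alpha+1)^2/\Gamma(2\alpha+2)$ and with a midpoint-rule approximation of the integral. The function names and the quadrature resolution are illustrative choices.

```python
from math import gamma, pi, sqrt


def J_closed(alpha):
    """Closed form on the right-hand side of (37)."""
    return 2.0 ** (-1 - 2 * alpha) * sqrt(pi) * gamma(alpha + 1) / gamma(alpha + 1.5)


def J_beta(alpha):
    """The integral written as the Beta function B(alpha + 1, alpha + 1)."""
    return gamma(alpha + 1) ** 2 / gamma(2 * alpha + 2)


def J_midpoint(alpha, n=100_000):
    """Midpoint-rule approximation of the integral of (u(1-u))^alpha over (0, 1)."""
    h = 1.0 / n
    return h * sum(((i + 0.5) * h * (1.0 - (i + 0.5) * h)) ** alpha for i in range(n))
```

For $\alpha = 2$ all three agree with the value $J(2) = 1/30$ used in the examples later in this section.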

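Proposition 3.3 Let $\alpha > 1$. Then there exists a random variable $\tilde{D}_\alpha$ such that as $m \to \infty$ we have $D^\alpha(\mathcal{U}_m^0) \to \tilde{D}_\alpha$ almost surely and in $L^2$. Also, the random variable $\tilde{D}_\alpha$ satisfies the distributional fixed-point equality (15). Further, $E[\tilde{D}_\alpha] = 1/(\alpha - 1)$ and

\[ \mathrm{Var}[\tilde{D}_\alpha] = \frac{\alpha \left( \alpha - 2 + 2(2\alpha+1) J(\alpha) \right)}{(\alpha-1)^2 (2\alpha-1)}. \tag{38} \]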

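Proof. Let $Z_i$ be the length of the $i$th edge of the DLT, as defined at (17). Let $\tilde{D}_\alpha := \sum_{i=1}^\infty Z_i^\alpha$. The sum converges almost surely since it has non-negative terms and, by the moment formulae established earlier, has finite expectation for $\alpha > 1$. By the same moment formulae and Cauchy-Schwarz, there exists a constant $0 < C < \infty$ such that

\[ E[\tilde{D}_\alpha^2] = \sum_{i=1}^\infty \sum_{j=1}^\infty E[Z_i^\alpha Z_j^\alpha] \le C \sum_{i=1}^\infty \sum_{j=1}^\infty i^{-\alpha} j^{-\alpha} < \infty, \]

since $\alpha > 1$. The $L^2$ convergence then follows from the dominated convergence theorem.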

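Taking $U = X_1$ here, by the self-similarity of the DLT we have that

\[ D^\alpha(\mathcal{U}_m^0) \stackrel{d}{=} U^\alpha D^\alpha_{\{1\}}(\mathcal{U}_N^0) + (1-U)^\alpha D^\alpha_{\{2\}}(\mathcal{U}^0_{m-1-N}) + U^\alpha, \tag{39} \]

where $N \sim \mathrm{Bin}(m-1, U)$, given $U$, and, given $U$ and $N$, $D^\alpha_{\{1\}}(\mathcal{U}_N^0)$ and $D^\alpha_{\{2\}}(\mathcal{U}^0_{m-1-N})$ are independent with the distribution of $D^\alpha(\mathcal{U}_N^0)$ and $D^\alpha(\mathcal{U}^0_{m-1-N})$, respectively. As $m \to \infty$, $N$ and $m - N$ both tend to infinity almost surely, and so, by taking $m \to \infty$ in (39), we obtain the fixed-point equation (15).

The identity $E[\tilde{D}_\alpha] = (\alpha - 1)^{-1}$ is obtained either from (24) of Proposition 3.1, or by taking expectations in (15). Next, if we set $\hat{D}_\alpha = \tilde{D}_\alpha - E[\tilde{D}_\alpha]$, (15) yields

\[ \hat{D}_\alpha \stackrel{d}{=} U^\alpha \hat{D}_\alpha^{\{1\}} + (1-U)^\alpha \hat{D}_\alpha^{\{2\}} + \frac{\alpha}{\alpha-1} U^\alpha + \frac{1}{\alpha-1} (1-U)^\alpha - \frac{1}{\alpha-1}, \]

where $\hat{D}_\alpha^{\{1\}}$ and $\hat{D}_\alpha^{\{2\}}$ are independent copies of $\hat{D}_\alpha$, independent of $U$. Then, using the definition (37) of $J(\alpha)$, the fact that $E[\hat{D}_\alpha] = 0$, and independence, we obtain from this that

\[ E[\hat{D}_\alpha^2] = \frac{2 E[\hat{D}_\alpha^2]}{2\alpha+1} + \frac{\alpha^2+1}{(\alpha-1)^2(2\alpha+1)} + \frac{2\alpha J(\alpha)}{(\alpha-1)^2} - \frac{1}{(\alpha-1)^2}, \]

and rearranging this gives (38). □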



Recall from Lemma 3.2 that $\tilde{\mathcal{L}}_0^\alpha$ is the limiting weight of edges attached to the origin in the DLT on uniform points. Combining this fact with Proposition 3.3, we obtain a similar result to the latter for the unrooted case, as follows:

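Proposition 3.4 Let $\alpha > 1$. There is a random variable $\tilde{F}_\alpha$, satisfying the distributional fixed-point equality (16), such that $D^\alpha(\mathcal{U}_m) \to \tilde{F}_\alpha$ as $m \to \infty$, almost surely and in $L^2$. Further, $E[\tilde{F}_\alpha] = 1/(\alpha(\alpha - 1))$, and

\[ \mathrm{Var}[\tilde{F}_\alpha] = \frac{1}{2\alpha} \mathrm{Var}[\tilde{D}_\alpha] + \frac{\alpha + 2(2\alpha+1) J(\alpha) - 2}{2\alpha^2 (\alpha-1)^2}, \tag{40} \]

where $J(\alpha)$ is given by (37) and $\mathrm{Var}[\tilde{D}_\alpha]$ by (38).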
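Proof. By Lemma 3.2 and Proposition 3.3 there are random variables $\tilde{D}_\alpha$ and $\tilde{\mathcal{L}}_0^\alpha$ such that as $m \to \infty$ we have $D^\alpha(\mathcal{U}_m^0) \to \tilde{D}_\alpha$ and $\mathcal{L}_0^\alpha(\mathcal{U}_m^0) \to \tilde{\mathcal{L}}_0^\alpha$, in $L^2$ and also with almost sure convergence in both cases. Hence, setting $\tilde{F}_\alpha := \tilde{D}_\alpha - \tilde{\mathcal{L}}_0^\alpha$, and since the DLF weight is the DLT weight less the weight of the edges attached to the origin, we have that

\[ D^\alpha(\mathcal{U}_m) = D^\alpha(\mathcal{U}_m^0) - \mathcal{L}_0^\alpha(\mathcal{U}_m^0) \to \tilde{F}_\alpha, \quad \text{a.s. and in } L^2. \tag{41} \]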

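Next, we show that $\tilde{F}_\alpha$ satisfies the distributional fixed-point equality (16). The self-similarity of the DLT implies that

\[ D^\alpha(\mathcal{U}_m) \stackrel{d}{=} U^\alpha D^\alpha(\mathcal{U}_N) + (1-U)^\alpha D^\alpha(\mathcal{U}^0_{m-1-N}), \tag{42} \]

where $N \sim \mathrm{Bin}(m-1, U)$, given $U$, and $D^\alpha(\mathcal{U}_N)$ and $D^\alpha(\mathcal{U}^0_{m-1-N})$ are independent, given $U$ and $N$. As $m \to \infty$, $N$ and $m - N$ both tend to infinity almost surely, so taking $m \to \infty$ in (42), using Proposition 3.3 and eqn (41), we obtain the fixed-point equation (16).

The identity $E[\tilde{F}_\alpha] = \alpha^{-1}(\alpha - 1)^{-1}$ is obtained either by (33), or by taking expectations in (16) and using the formula for $E[\tilde{D}_\alpha]$ in Proposition 3.3. Then with $\hat{F}_\alpha := \tilde{F}_\alpha - E[\tilde{F}_\alpha]$, we obtain

\[ \hat{F}_\alpha \stackrel{d}{=} U^\alpha \hat{F}_\alpha + (1-U)^\alpha \hat{D}_\alpha + \frac{U^\alpha}{\alpha(\alpha-1)} + \frac{(1-U)^\alpha}{\alpha-1} - \frac{1}{\alpha(\alpha-1)} \]

from (16), and using independence and the fact that $E[\hat{F}_\alpha] = E[\hat{D}_\alpha] = 0$ we obtain

\[ \frac{2\alpha}{2\alpha+1} E[\hat{F}_\alpha^2] = \frac{E[\hat{D}_\alpha^2]}{2\alpha+1} + \frac{2\alpha J(\alpha) - 1}{\alpha^2(\alpha-1)^2} + \frac{\alpha^2+1}{\alpha^2(\alpha-1)^2(2\alpha+1)}, \]

which yields (40). □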

Examples. When $\alpha = 2$ we have that $E[\tilde{D}_2] = 1$ and $J(2) = 1/30$, so that $\mathrm{Var}[\tilde{D}_2] = 2/9$. Also, $E[\tilde{F}_2] = 1/2$ and $\mathrm{Var}[\tilde{F}_2] = 7/72 \approx 0.0972$.
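
These values can be reproduced in exact arithmetic. The sketch below encodes the mean and variance formulae of Propositions 3.3 and 3.4 as plain functions; the function names are illustrative, and $J(\alpha)$ is passed in as an argument (here the value $J(2) = 1/30$ quoted above) so that rational inputs give rational outputs.

```python
from fractions import Fraction


def mean_D(alpha):
    """E[D_alpha] = 1/(alpha - 1) for the rooted limit, alpha > 1."""
    return 1 / (alpha - 1)


def var_D(alpha, J):
    """Variance of the rooted limit, formula (38), with J = J(alpha)."""
    return alpha * (alpha - 2 + 2 * (2 * alpha + 1) * J) / ((alpha - 1) ** 2 * (2 * alpha - 1))


def mean_F(alpha):
    """E[F_alpha] = 1/(alpha (alpha - 1)) for the unrooted limit."""
    return 1 / (alpha * (alpha - 1))


def var_F(alpha, J):
    """Variance of the unrooted limit, in terms of var_D and J = J(alpha)."""
    extra = (alpha + 2 * (2 * alpha + 1) * J - 2) / (2 * alpha**2 * (alpha - 1) ** 2)
    return var_D(alpha, J) / (2 * alpha) + extra
```

With `alpha = Fraction(2)` and `J = Fraction(1, 30)` these return 1, 2/9, 1/2 and 7/72 respectively.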
3.4 Limit behaviour for $\alpha = 1$



Unlike in the case $\alpha > 1$, for $\alpha = 1$ the mean of the total weight $D^1(\mathcal{U}_m^0)$ diverges as $m \to \infty$ (see Proposition 3.1), so clearly there is no limiting distribution for $D^1(\mathcal{U}_m^0)$. Nevertheless, by using the orthogonality of the increments of the sequence $(D^1(\mathcal{U}_m^0), m \ge 1)$, we are able to show that the centred total weight $\tilde{D}^1(\mathcal{U}_m^0) = D^1(\mathcal{U}_m^0) - E[D^1(\mathcal{U}_m^0)]$ does converge in distribution (in fact, in $L^2$) to a limiting random variable, and likewise for the unrooted case; this is our next result.

Subsequently, we shall characterize the distribution of the limiting random variable (for both the rooted and unrooted cases) by a fixed-point identity, and thereby complete the proof of Theorem 3.1 (i).

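Proposition 3.5 (i) As $m \to \infty$, the random variable $\tilde{D}^1(\mathcal{U}_m^0)$ converges in $L^2$ to a limiting random variable $\tilde{D}_1$, with $E[\tilde{D}_1] = 0$ and $\mathrm{Var}[\tilde{D}_1] = 2 - \pi^2/6$. In particular, $\mathrm{Var}[D^1(\mathcal{U}_m^0)] \to 2 - \pi^2/6$ as $m \to \infty$.

(ii) As $m \to \infty$, $\tilde{D}^1(\mathcal{U}_m)$ converges in $L^2$ to the limiting random variable $\tilde{F}_1 := \tilde{D}_1 - \tilde{\mathcal{L}}_0^1 + 1$.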



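Proof. Adopt the convention $D^1(\mathcal{U}_0^0) = 0$. By the orthogonality of the $Z_j$ (Lemma 3.5) and the variance $\mathrm{Var}[Z_j] = j/((j+1)^2(j+2))$, for $0 \le \ell < m$,

\[ \mathrm{Var}\left[ \tilde{D}^1(\mathcal{U}_m^0) - \tilde{D}^1(\mathcal{U}_\ell^0) \right] = \mathrm{Var} \sum_{j=\ell+1}^m (Z_j - E[Z_j]) = \sum_{j=\ell+1}^m \frac{j}{(j+1)^2(j+2)} \to 0 \quad \text{as } m, \ell \to \infty. \]

Hence $\tilde{D}^1(\mathcal{U}_m^0)$ is a Cauchy sequence in $L^2$, and so converges in $L^2$ to a limiting random variable, which we denote $\tilde{D}_1$. Then $E[\tilde{D}_1] = \lim_{m \to \infty} E[\tilde{D}^1(\mathcal{U}_m^0)] = 0$, and

\[ \mathrm{Var}[\tilde{D}_1] = \lim_{m \to \infty} \mathrm{Var}\left[ \tilde{D}^1(\mathcal{U}_m^0) \right] = \sum_{j=1}^\infty \frac{j}{(j+1)^2(j+2)} = \sum_{j=1}^\infty \left( \frac{2}{j+1} - \frac{2}{j+2} \right) - \sum_{j=1}^\infty \frac{1}{(j+1)^2} = 2 - \frac{\pi^2}{6}. \]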

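It remains to prove part (ii), the convergence for the centred total length of the DLF, $\tilde{D}^1(\mathcal{U}_m)$. Since the DLF weight is the DLT weight less the weight $\mathcal{L}_0^1$ of the edges attached to the origin, we have that

\[ \tilde{D}^1(\mathcal{U}_m) = \tilde{D}^1(\mathcal{U}_m^0) - \mathcal{L}_0^1(\mathcal{U}_m^0) + E[\mathcal{L}_0^1(\mathcal{U}_m^0)] \stackrel{L^2}{\longrightarrow} \tilde{D}_1 - \tilde{\mathcal{L}}_0^1 + 1, \]

where the convergence follows by Lemma 3.2 and part (i). Thus $\tilde{D}^1(\mathcal{U}_m)$ converges in $L^2$ as $m \to \infty$. □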
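
The limiting variance in part (i) can be seen numerically from the partial sums. The sketch below sums $\mathrm{Var}[Z_j] = j/((j+1)^2(j+2))$, which by orthogonality is the variance of $D^1(\mathcal{U}_m^0)$; the function name is illustrative.

```python
from math import pi


def var_dlt_total_length(m):
    """Var[D^1(U_m^0)] as the sum of Var[Z_j] for j = 1, ..., m (alpha = 1)."""
    return sum(j / ((j + 1) ** 2 * (j + 2)) for j in range(1, m + 1))


# Limit 2 - pi^2/6, approximately 0.3551, approached from below.
LIMIT = 2.0 - pi**2 / 6.0
```

The first two partial sums are 1/12 and 5/36, matching Remark (i) after Lemma 3.5, and the tail beyond $m$ is of order $1/m$.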



For the next few results it is more convenient to consider the DLF defined on a Poisson number of points. Let $(X_1, X_2, \ldots)$ be a sequence of independent uniformly distributed random variables in $(0, 1]$, and let $(N(t), t \ge 0)$ be the counting process of a homogeneous Poisson process of unit rate in $(0, \infty)$, independent of $(X_1, X_2, \ldots)$. Thus $N(t)$ is a Poisson variable with parameter $t$. As before, let $\mathcal{U}_m = (X_1, \ldots, X_m)$, and (for this section only) let $\mathcal{P}_t := \mathcal{U}_{N(t)}$. Let $\mathcal{P}_t^0 := \mathcal{U}^0_{N(t)}$, so that $\mathcal{P}_t^0 = (0, X_1, X_2, \ldots, X_{N(t)})$.

We construct the DLF and DLT on $X_1, X_2, \ldots, X_{N(t)}$ as before. Let $\tilde{D}^1(\mathcal{P}_t^0) = D^1(\mathcal{P}_t^0) - E[D^1(\mathcal{P}_t^0)]$ and $\tilde{D}^1(\mathcal{P}_t) = D^1(\mathcal{P}_t) - E[D^1(\mathcal{P}_t)]$. We aim to show that the limit distribution for $\tilde{D}^1(\mathcal{P}_t^0)$ is the same as for $\tilde{D}^1(\mathcal{U}_m^0)$, and likewise in the unrooted case. We shall need the following result.



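Lemma 3.6 As $t \to \infty$,

\[ \frac{d}{dt} E[D^1(\mathcal{P}_t)] = \frac{1}{t} + O(t^{-2}), \quad \text{and} \quad \frac{d}{dt} E[D^1(\mathcal{P}_t^0)] = \frac{1}{t} + O(t^{-2}). \tag{43} \]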



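Proof. The point set $\{X_1, \ldots, X_{N(t)}\}$ is a homogeneous Poisson point process of intensity $t$ in $(0, 1)$, so we have

\begin{align*}
\frac{d}{dt} E[D^1(\mathcal{P}_t)] &= E[\text{length of new arrival}] \\
&= \int_0^1 du \, E[\text{dist. to next pt. to the left of } u \text{ in } \mathcal{P}_t] \\
&= \int_0^1 du \int_0^u s t e^{-ts} \, ds = \frac{1}{t} + \frac{2}{t^2}(e^{-t} - 1) + \frac{e^{-t}}{t}.
\end{align*}

Similarly,

\begin{align*}
\frac{d}{dt} E[D^1(\mathcal{P}_t^0)] &= \int_0^1 du \, E[\text{dist. to next pt. to the left of } u \text{ in } \mathcal{P}_t \cup \{0\}] \\
&= \int_0^1 du \int_0^u P[\text{dist. to next pt. to the left} > s] \, ds \\
&= \int_0^1 du \int_0^u e^{-ts} \, ds = \frac{1}{t} + \frac{e^{-t} - 1}{t^2}. \qquad \square
\end{align*}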
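
The two closed forms in this proof can be cross-checked by conditioning on $N(t)$, since the derivative of a Poisson mixture $\sum_m P[N(t) = m]\, \mu_m$ is $\sum_m P[N(t) = m] (\mu_{m+1} - \mu_m)$. The sketch below assumes the DLT increment $E[Z_{m+1}] = 1/(m+2)$ from (19). The DLF increment $m/((m+1)(m+2))$ is an assumption derived here by symmetry (the new point is leftmost with probability $1/(m+1)$ and then has mean position $1/(m+2)$) and is not quoted from the text. All function names are illustrative.

```python
from math import exp, lgamma, log, sqrt


def poisson_pmf(m, t):
    """P[N(t) = m] for a Poisson variable with mean t > 0, computed in log space."""
    return exp(m * log(t) - t - lgamma(m + 1))


def mixture(t, increment):
    """Sum over m of P[N(t) = m] * increment(m), truncated far into the tail."""
    m_max = int(t + 20 * sqrt(t) + 50)
    return sum(poisson_pmf(m, t) * increment(m) for m in range(m_max + 1))


def dlt_rate_mixture(t):
    # Adding point m + 1 to the DLT adds expected length E[Z_{m+1}] = 1/(m+2).
    return mixture(t, lambda m: 1.0 / (m + 2))


def dlt_rate_closed(t):
    return 1.0 / t + (exp(-t) - 1.0) / t**2


def dlf_rate_mixture(t):
    # Assumed DLF increment: the DLT increment minus the expected length of
    # an edge to the origin, (1/(m+1)) * (1/(m+2)).
    return mixture(t, lambda m: m / ((m + 1.0) * (m + 2.0)))


def dlf_rate_closed(t):
    return 1.0 / t + 2.0 * (exp(-t) - 1.0) / t**2 + exp(-t) / t
```

Both pairs agree to numerical precision for moderate $t$, and both closed forms are visibly $1/t + O(t^{-2})$.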







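Lemma 3.7 (i) As $t \to \infty$, $\tilde{D}^1(\mathcal{P}_t^0)$ converges in distribution to $\tilde{D}_1$, the large-$m$ limit of $\tilde{D}^1(\mathcal{U}_m^0)$.

(ii) As $t \to \infty$, $\tilde{D}^1(\mathcal{P}_t)$ converges in distribution to $\tilde{F}_1$, the large-$m$ limit of $\tilde{D}^1(\mathcal{U}_m)$.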
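Proof. (i) From Proposition 3.5 we have $\tilde{D}^1(\mathcal{U}_m^0) \stackrel{L^2}{\longrightarrow} \tilde{D}_1$ as $m \to \infty$. Let $a_t := E[D^1(\mathcal{P}_t^0)]$ and $\mu_m := E[D^1(\mathcal{U}_m^0)]$. Since $\mu_m = E \sum_{i=1}^m Z_i = \sum_{i=1}^m (1+i)^{-1}$ by (19), for any positive integers $\ell, m$ we have

\[ |\mu_m - \mu_\ell| = \sum_{j=\min(m,\ell)+1}^{\max(m,\ell)} \frac{1}{j+1} \le \log \left( \frac{\max(m,\ell)+1}{\min(m,\ell)+1} \right) = \left| \log \left( \frac{m+1}{\ell+1} \right) \right|. \tag{44} \]

Note the distributional equalities

\begin{align*}
\mathcal{L}\left( D^1(\mathcal{P}_t^0) \mid N(t) = m \right) &= \mathcal{L}\left( D^1(\mathcal{U}_m^0) \right); \\
\mathcal{L}\left( D^1(\mathcal{P}_t^0) - \mu_{N(t)} \mid N(t) = m \right) &= \mathcal{L}\left( \tilde{D}^1(\mathcal{U}_m^0) \right). \tag{45}
\end{align*}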

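First we aim to show that $a_t - \mu_{\lfloor t \rfloor} \to 0$ as $t \to \infty$. Set $p_m(t) := e^{-t} t^m / m!$. Then we can write

\begin{align*}
a_t - \mu_{\lfloor t \rfloor} &= \sum_{m=0}^\infty p_m(t) (\mu_m - \mu_{\lfloor t \rfloor}) \\
&= \sum_{|m - \lfloor t \rfloor| \le t^{3/4}} p_m(t) (\mu_m - \mu_{\lfloor t \rfloor}) + \sum_{|m - \lfloor t \rfloor| > t^{3/4}} p_m(t) (\mu_m - \mu_{\lfloor t \rfloor}). \tag{46}
\end{align*}

We examine these two sums separately. First consider the sum for $|m - \lfloor t \rfloor| \le t^{3/4}$. By (44), we have

\[ \sup_{m : |m - \lfloor t \rfloor| \le t^{3/4}} |\mu_m - \mu_{\lfloor t \rfloor}| \le \log \left( \frac{\lfloor t \rfloor + t^{3/4} + 1}{\lfloor t \rfloor - t^{3/4} + 1} \right) = O\left( t^{-1/4} \right), \quad \text{as } t \to \infty. \]

Hence the first sum in (46) tends to zero as $t \to \infty$. To estimate the second sum, observe that

\begin{align*}
\Big| \sum_{|m - \lfloor t \rfloor| > t^{3/4}} p_m(t) (\mu_m - \mu_{\lfloor t \rfloor}) \Big| &\le \sum_{|m - \lfloor t \rfloor| > t^{3/4}} p_m(t) (m + t) \\
&= E\left[ (N(t) + t) \mathbf{1}\{ |N(t) - \lfloor t \rfloor| > t^{3/4} \} \right] \\
&\le \left( E\left[ (N(t) + t)^2 \right] \cdot P\left[ |N(t) - \lfloor t \rfloor| > t^{3/4} \right] \right)^{1/2}. \tag{47}
\end{align*}

By Chernoff bounds on the tail probabilities of a Poisson random variable (e.g. Lemma 1.4 of JU), the expression (47) is $O(t \exp(-t^{1/2}/18))$ and so tends to zero. Hence the second sum in (46) tends to zero, and thus

\[ a_t - \mu_{\lfloor t \rfloor} \to 0 \quad \text{as } t \to \infty. \tag{48} \]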
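Now we show that $\tilde{D}^1(\mathcal{P}_t^0) \stackrel{d}{\longrightarrow} \tilde{D}_1$ as $t \to \infty$. We have

\[ \tilde{D}^1(\mathcal{P}_t^0) = \left( D^1(\mathcal{P}_t^0) - \mu_{N(t)} \right) + \left( \mu_{N(t)} - \mu_{\lfloor t \rfloor} \right) + \left( \mu_{\lfloor t \rfloor} - a_t \right). \tag{49} \]

The final bracket tends to zero, by (48). Also, by (45) and the fact that $N(t) \to \infty$ a.s. as $t \to \infty$, we have

\[ D^1(\mathcal{P}_t^0) - \mu_{N(t)} \stackrel{d}{\longrightarrow} \tilde{D}_1. \]

Finally, using (44), we have

\[ |\mu_{N(t)} - \mu_{\lfloor t \rfloor}| \le \left| \log \left( \frac{N(t)+1}{\lfloor t \rfloor + 1} \right) \right| \stackrel{P}{\longrightarrow} 0, \]

as $t \to \infty$, since $N(t)/\lfloor t \rfloor \stackrel{P}{\longrightarrow} 1$. So Slutsky's theorem applied to (49) yields $\tilde{D}^1(\mathcal{P}_t^0) \stackrel{d}{\longrightarrow} \tilde{D}_1$ as $t \to \infty$, completing the proof of (i).

The proof of (ii) follows in the same way as that of (i), except that in (44) the first equals sign is replaced by an inequality $\le$. This does not affect the rest of the proof. □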
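
The de-Poissonization step (48) can be illustrated numerically. The sketch below computes $a_t$ as a Poisson mixture of the $\mu_m = \sum_{i=1}^m 1/(i+1)$ and compares it with $\mu_{\lfloor t \rfloor}$. The truncation point of the sum and the function names are illustrative choices.

```python
from math import exp, floor, lgamma, log, sqrt


def mu(m):
    """mu_m = E[D^1(U_m^0)] = sum over i = 1..m of 1/(i+1)."""
    return sum(1.0 / (i + 1) for i in range(1, m + 1))


def a(t):
    """a_t = sum over m of P[N(t) = m] * mu_m, truncated far into the tail."""
    m_max = int(t + 20 * sqrt(t) + 50)
    total, mu_m = 0.0, 0.0
    for m in range(m_max + 1):
        if m >= 1:
            mu_m += 1.0 / (m + 1)  # running value of mu_m
        total += exp(m * log(t) - t - lgamma(m + 1)) * mu_m
    return total


def gap(t):
    """The difference a_t - mu_{floor(t)}, which (48) says tends to zero."""
    return a(t) - mu(floor(t))
```

Evaluating `gap` at 10, 100 and 1000 shows the difference shrinking in absolute value as $t$ grows.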

The next two propositions complete the proof of Theorem 3.1.

Proposition 3.6 The limiting random variable $\tilde{D}_1$ of Proposition 3.5 (i) satisfies the fixed-point equation of Theorem 3.1 (i).

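Proof. For integer $n > 0$, let $T_n := \min\{s : N(s) \ge n\}$, the $n$th arrival time of the Poisson process with counting process $N(\cdot)$. Set $T := T_1$, and set $U := X_1$ (which is uniform on $(0, 1)$).

By the Marking Theorem for Poisson processes, the two-dimensional point process $\mathcal{Q} := \{(X_n, T_n) : n \ge 1\}$ is a homogeneous Poisson process of unit intensity on $(0, 1) \times (0, \infty)$. Given the value of $(U, T)$, the restriction of $\mathcal{Q}$ to $(0, U] \times (T, \infty)$ and the restriction of $\mathcal{Q}$ to $(U, 1] \times (T, \infty)$ are independent homogeneous Poisson processes on these regions. Hence, by scaling properties of the Poisson process (see the Mapping Theorem in ^U]) and of the DLT, writing $D^1_{\{i\}}(\cdot)$, $i = 1, 2$, for independent copies of $D^1(\cdot)$, we have

\[ D^1(\mathcal{P}_t^0) \stackrel{d}{=} \left( U D^1_{\{1\}}(\mathcal{P}^0_{U(t-T)}) + (1-U) D^1_{\{2\}}(\mathcal{P}^0_{(1-U)(t-T)}) + U \right) \mathbf{1}\{t > T\}. \tag{50} \]

Let $a_s = 0$ for $s \le 0$, and $a_s = E[D^1(\mathcal{P}_s^0)]$ for $s > 0$. Then $\tilde{D}^1(\mathcal{P}_t^0) = D^1(\mathcal{P}_t^0) - a_t$, so that by (50),

\begin{align*}
\tilde{D}^1(\mathcal{P}_t^0) \stackrel{d}{=} {} & \left( U \tilde{D}^1_{\{1\}}(\mathcal{P}^0_{U(t-T)}) + (1-U) \tilde{D}^1_{\{2\}}(\mathcal{P}^0_{(1-U)(t-T)}) + U \right) \mathbf{1}\{t > T\} \\
& + U \left( a_{U(t-T)} - a_t \right) + (1-U) \left( a_{(1-U)(t-T)} - a_t \right). \tag{51}
\end{align*}

From Lemma 3.6 we have $\frac{da_t}{dt} = \frac{1}{t} + O(t^{-2})$. Hence, if $T < t$, then

\[ a_t - a_{U(t-T)} = \int_{U(t-T)}^t \frac{da_s}{ds} \, ds = \log t - \log\{U(t-T)\} + O\left( (U(t-T))^{-1} \right), \]

and hence as $t \to \infty$,

\[ a_t - a_{U(t-T)} \to -\log U, \quad \text{a.s.} \tag{52} \]

Since $P[T < t]$ tends to 1, by making $t \to \infty$ in (51) and using Slutsky's theorem we obtain

\[ \tilde{D}_1 \stackrel{d}{=} U \tilde{D}_1^{\{1\}} + (1-U) \tilde{D}_1^{\{2\}} + U \log U + (1-U) \log(1-U) + U, \]

with $\tilde{D}_1^{\{1\}}$, $\tilde{D}_1^{\{2\}}$ independent copies of $\tilde{D}_1$, independent of $U$; this is the required fixed-point equation. □

Proposition 3.7 The limiting random variable $\tilde{F}_1$ of Proposition 3.5 (ii) satisfies the fixed-point equation of Theorem 3.1 (i), and so has the same distribution as $\tilde{D}_1$. Also, $\mathrm{Cov}(\tilde{F}_1, \tilde{D}_1) = (7/4) - \pi^2/6$.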

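Proof. The proof follows similar lines to that of Proposition 3.6. Once more let $a_s = E[D^1(\mathcal{P}_s^0)]$ for $s > 0$, and $a_s = 0$ for $s \le 0$. Let $b_s = E[D^1(\mathcal{P}_s)]$ for $s > 0$, and $b_s = 0$ for $s \le 0$, and let $T := \min\{t : N(t) \ge 1\}$. Then

\[ D^1(\mathcal{P}_t) \stackrel{d}{=} \left( U D^1_{\{1\}}(\mathcal{P}_{U(t-T)}) + (1-U) D^1_{\{2\}}(\mathcal{P}^0_{(1-U)(t-T)}) \right) \mathbf{1}\{t > T\}, \tag{53} \]

where $D^1_{\{1\}}(\cdot)$ and $D^1_{\{2\}}(\cdot)$ are independent copies of $D^1(\cdot)$. Then $\tilde{D}^1(\mathcal{P}_t) = D^1(\mathcal{P}_t) - b_t$ and $\tilde{D}^1(\mathcal{P}_t^0) = D^1(\mathcal{P}_t^0) - a_t$, so that (53) yields

\begin{align*}
\tilde{D}^1(\mathcal{P}_t) \stackrel{d}{=} {} & \left( U \tilde{D}^1_{\{1\}}(\mathcal{P}_{U(t-T)}) + (1-U) \tilde{D}^1_{\{2\}}(\mathcal{P}^0_{(1-U)(t-T)}) \right) \mathbf{1}\{t > T\} \\
& + U \left( b_{U(t-T)} - b_t \right) + (1-U) \left( a_{(1-U)(t-T)} - b_t \right). \tag{54}
\end{align*}

From Lemma 3.6 we have $\frac{db_t}{dt} = \frac{1}{t} + O(t^{-2})$. Hence, by the same argument as used at (52),

\[ b_t - b_{U(t-T)} \to -\log U \quad \text{a.s.} \]

Also, $a_t - b_t = E[\mathcal{L}_0^1(\mathcal{P}_t^0)]$, since the DLT weight is the DLF weight plus the weight of the edges attached to the origin, so that $\lim_{t \to \infty}(a_t - b_t) = 1$, by Lemma 3.2 and the fact that $E[\tilde{\mathcal{L}}_0^1] = 1$. Using also (52) we find that as $t \to \infty$,

\[ a_{(1-U)(t-T)} - b_t = \left( a_{(1-U)(t-T)} - a_t \right) + (a_t - b_t) \to 1 + \log(1-U), \quad \text{a.s.} \]

Taking $t \to \infty$ in (54), and using Slutsky's theorem, we obtain

\[ \tilde{F}_1 \stackrel{d}{=} U \tilde{F}_1 + (1-U) \tilde{D}_1 + U \log U + (1-U) \log(1-U) + (1-U). \tag{55} \]

The change of variable $(1-U) \mapsto U$ then shows that $\tilde{D}_1$, which satisfies the fixed-point equation of Proposition 3.6, also satisfies (55), and so by the uniqueness of solution, $\tilde{F}_1$ has the same distribution as $\tilde{D}_1$ and satisfies the same fixed-point equation.

To obtain the covariance of $\tilde{F}_1$ and $\tilde{D}_1$, observe from Proposition 3.5 (ii) that $\tilde{\mathcal{L}}_0^1 = \tilde{D}_1 - \tilde{F}_1 + 1$, and therefore by (29), we have that

\[ 1/2 = \mathrm{Var}[\tilde{\mathcal{L}}_0^1] = \mathrm{Var}[\tilde{D}_1] + \mathrm{Var}[\tilde{F}_1] - 2\,\mathrm{Cov}(\tilde{D}_1, \tilde{F}_1). \tag{56} \]

Since $\mathrm{Var}[\tilde{F}_1] = \mathrm{Var}[\tilde{D}_1] = 2 - \pi^2/6$ by Proposition 3.5 (i), rearranging we find that $\mathrm{Cov}(\tilde{D}_1, \tilde{F}_1) = (7/4) - \pi^2/6$. □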
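
The variance $2 - \pi^2/6$ can be recovered from the fixed-point equation alone. The sketch below assumes the equation has the form $D \stackrel{d}{=} U D' + (1-U) D'' + g(U)$ with $g(u) = u + u \log u + (1-u)\log(1-u)$, as obtained by letting $t \to \infty$ in (51) and using (52). Taking second moments with $E[D] = 0$ then gives $E[D^2](1 - 2E[U^2]) = E[g(U)^2]$. The quadrature routine and function names are illustrative.

```python
from math import log, pi


def g(u):
    """Toll term u + u log u + (1-u) log(1-u) of the assumed fixed-point equation."""
    return u + u * log(u) + (1.0 - u) * log(1.0 - u)


def midpoint(f, n=200_000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def moments_from_fixed_point():
    mean_g = midpoint(g)  # vanishes, consistent with a centred solution
    # E[U^2] + E[(1-U)^2] = 2/3, so Var = E[g(U)^2] / (1 - 2/3).
    var_d1 = midpoint(lambda u: g(u) ** 2) / (1.0 - 2.0 / 3.0)
    # Rearranging (56) with equal variances: Cov = Var - 1/4.
    cov_f1_d1 = var_d1 - 0.25
    return mean_g, var_d1, cov_f1_d1
```

The returned variance is approximately 0.3551 and the covariance approximately 0.1051, in agreement with $2 - \pi^2/6$ and $(7/4) - \pi^2/6$.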

Remark. Figure 2 is a plot of the estimated probability density function of $\tilde{D}_1$.
This was obtained by performing 10^ repeated simulations of the DLT on a se- 
quence of 10^ uniform (simulated) random points on (0,1]. For each simulation, 

the expected value of $D^1(\mathcal{U}^0_{10^3})$ (which is precisely $(1/2) + (1/3) + \cdots + (1/1001)$ by Lemma 3.1) was subtracted from the total length of the simulated DLT to give an
approximate realization of Di. The density function was then estimated from the 
sample of 10^ approximate realizations of Di, using a window width of 0.0025. The 
simulated sample from which the density estimate for Di was taken had sample 
mean ~ —2 x 10~^ and sample variance ~ 0.3543, which are reasonably close to the 
expectation and variance of Di. 
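
A simulation of the kind described in this remark can be sketched as follows. This is a minimal illustration, not the authors' code: the number of points, the number of repetitions and the seed are arbitrary choices, and no density estimation is attempted.

```python
import bisect
import random


def dlt_total_length(n, rng):
    """Total length of the DLT on n uniform points in (0, 1], rooted at 0."""
    placed = [0.0]  # sorted positions, including the root
    total = 0.0
    for _ in range(n):
        x = 1.0 - rng.random()  # uniform on (0, 1]
        k = bisect.bisect_left(placed, x)
        total += x - placed[k - 1]  # edge to the nearest earlier point on the left
        placed.insert(k, x)
    return total


def centred_dlt_samples(n, reps, seed=0):
    """Approximate realizations of the centred limit: total length minus its mean."""
    rng = random.Random(seed)
    centring = sum(1.0 / (i + 1) for i in range(1, n + 1))
    return [dlt_total_length(n, rng) - centring for _ in range(reps)]


def sample_mean_and_variance(xs):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    return mean, var
```

With a few hundred points and a few thousand repetitions, the sample mean should be near 0 and the sample variance near $2 - \pi^2/6 \approx 0.355$, up to Monte Carlo error.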
Figure 2: Estimated probability density function for $\tilde{D}_1$.

4 General results in geometric probability 

Notions of stabilizing functionals of point sets have recently proved to be a useful basis for a general methodology for establishing limit theorems for functionals of random point sets in $\mathbb{R}^d$. In particular, Penrose and Yukich [17, 18] provide general central limit theorems and laws of large numbers for stabilizing functionals. One might hope to apply these results in the case of the MDSF weight. In fact we shall obtain our law of large numbers (Theorem 2.1) by application of a result from [18], but to obtain the central limit theorem for edges away from the boundary in the MDSF and MDST, we need an extension of the general result in [17]. It is these general results that we describe in the present section.

For our general results, we use the following notation. Let $d \ge 1$ be an integer. For $\mathcal{X} \subset \mathbb{R}^d$, constant $a > 0$, and $y \in \mathbb{R}^d$, let $y + a\mathcal{X}$ denote the transformed set $\{y + ax : x \in \mathcal{X}\}$. Let $\mathrm{diam}(\mathcal{X}) := \sup\{\|x_1 - x_2\| : x_1, x_2 \in \mathcal{X}\}$, and let $\mathrm{card}(\mathcal{X})$ denote the cardinality (number of elements) of $\mathcal{X}$ (when finite).

For $x \in \mathbb{R}^d$ and $r > 0$, let $B(x; r)$ denote the closed Euclidean ball with centre $x$ and radius $r$, and let $Q(x; r)$ denote the corresponding $\ell_\infty$ ball, i.e., the $d$-cube $x + [-r, r]^d$. For bounded measurable $R \subset \mathbb{R}^d$ let $|R|$ denote the Lebesgue measure of $R$, let $\partial R$ denote the topological boundary of $R$, and for $r > 0$ set $\partial_r R := \cup_{x \in \partial R} Q(x; r)$, the $r$-neighbourhood of the boundary of $R$.

4.1 A general law of large numbers 

Let $\xi(x; \mathcal{X})$ be a measurable $\mathbb{R}_+$-valued function defined for all pairs $(x, \mathcal{X})$, where $\mathcal{X} \subset \mathbb{R}^d$ is finite and $x \in \mathcal{X}$. Assume $\xi$ is translation invariant, that is, for all $y \in \mathbb{R}^d$, $\xi(y + x; y + \mathcal{X}) = \xi(x; \mathcal{X})$. When $x \notin \mathcal{X}$, we abbreviate the notation $\xi(x; \mathcal{X} \cup \{x\})$ to $\xi(x; \mathcal{X})$.

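For our general law of large numbers, we use a notion of stabilization defined as follows. For any locally finite point set $\mathcal{X} \subset \mathbb{R}^d$ and any $\ell \in \mathbb{N}$ define

\begin{align*}
\xi^+(\mathcal{X}; \ell) &:= \sup_{k \in \mathbb{N}} \left( \operatorname*{ess\,sup}_{\ell, k} \left\{ \xi(0; (\mathcal{X} \cap B(0; \ell)) \cup \mathcal{A}) \right\} \right), \quad \text{and} \\
\xi^-(\mathcal{X}; \ell) &:= \inf_{k \in \mathbb{N}} \left( \operatorname*{ess\,inf}_{\ell, k} \left\{ \xi(0; (\mathcal{X} \cap B(0; \ell)) \cup \mathcal{A}) \right\} \right),
\end{align*}

where $\operatorname*{ess\,sup}_{\ell,k}$ is the essential supremum, with respect to Lebesgue measure on $\mathbb{R}^{dk}$, over sets $\mathcal{A} \subset \mathbb{R}^d \setminus B(0; \ell)$ of cardinality $k$. Define the limit of $\xi$ on $\mathcal{X}$ by

\[ \xi_\infty(\mathcal{X}) := \limsup_{k \to \infty} \xi^+(\mathcal{X}; k). \]

We say the functional $\xi$ stabilizes on $\mathcal{X}$ if

\[ \lim_{k \to \infty} \xi^+(\mathcal{X}; k) = \lim_{k \to \infty} \xi^-(\mathcal{X}; k) = \xi_\infty(\mathcal{X}). \tag{57} \]

For $\tau \in (0, \infty)$, let $\mathcal{H}_\tau$ be a homogeneous Poisson process of intensity $\tau$ on $\mathbb{R}^d$. The following general law of large numbers is due to Penrose and Yukich [18]. We shall use it to prove Theorem 2.1.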

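Lemma 4.1 [18] Suppose $q = 1$ or $q = 2$. Suppose $\xi$ is almost surely stabilizing on $\mathcal{H}_\tau$, with limit $\xi_\infty(\mathcal{H}_\tau)$, for all $\tau \in (0, \infty)$. Let $f$ be a probability density function on $\mathbb{R}^d$, and let $\mathcal{X}_n$ be the point process consisting of $n$ independent random $d$-vectors with common density $f$. If $\xi$ satisfies the moments condition

\[ \sup_{n \in \mathbb{N}} E\left[ \xi\left( n^{1/d} X_1; n^{1/d} \mathcal{X}_n \right)^p \right] < \infty, \tag{58} \]

for some $p > q$, then as $n \to \infty$,

\[ n^{-1} \sum_{x \in \mathcal{X}_n} \xi\left( n^{1/d} x; n^{1/d} \mathcal{X}_n \right) \stackrel{L^q}{\longrightarrow} \int_{\mathbb{R}^d} E\left[ \xi_\infty\left( \mathcal{H}_{f(x)} \right) \right] f(x) \, dx, \tag{59} \]

and the limit is finite.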



4.2 General central limit theorems 

In the course of the proof of Theorem 2.2, we shall use a modified form of a general central limit theorem obtained for functionals of geometric graphs by Penrose and Yukich [17]. We recall the setup of [17]. As in Section 4.1, let $\xi(x; \mathcal{X})$ be a translation invariant real-valued functional defined for finite $\mathcal{X} \subset \mathbb{R}^d$ and $x \in \mathcal{X}$. Then $\xi$ induces a translation invariant functional $H(\mathcal{X}; S)$ defined on all finite point sets $\mathcal{X} \subset \mathbb{R}^d$ and all Borel-measurable regions $S \subseteq \mathbb{R}^d$ by

\[ H(\mathcal{X}; S) := \sum_{x \in \mathcal{X} \cap S} \xi(x; \mathcal{X}). \tag{60} \]

It is this 'restricted' functional that interests us here, while [17] is concerned rather with the global functional $H(\mathcal{X}; \mathbb{R}^d)$. In our particular application (the length of edges of the MDST on random points in a square), the global functional fails to satisfy the conditions of the central limit theorems in [17], owing to boundary effects. Here we generalize the result in [17] to the 'restricted' functional $H(\mathcal{X}; S)$. It is this generalized result that we can apply to the MDST, when we take $S$ to be a region 'away from the boundary' of the square in which the random points are placed.

We use a notion of stabilization for $H$ which is related to, but not equivalent to, the notion of stabilization of $\xi$ used in Section 4.1. Loosely speaking, $\xi$ is stabilizing if, when a point is inserted at the origin into a homogeneous Poisson process, only nearby Poisson points affect the inserted point; for $H$ to be stabilizing we require also that the inserted point affects only nearby points.

For $B \subseteq \mathbb{R}^d$, let $\Delta(\mathcal{X}; B)$ denote the 'add one cost' of the functional $H$ on the insertion of a point at the origin,

\[ \Delta(\mathcal{X}; B) := H(\mathcal{X} \cup \{0\}; B) - H(\mathcal{X}; B). \]
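
To make the restricted functional (60) and the add one cost concrete, here is a small sketch. The choice of $\xi$ as the ordinary (undirected) nearest-neighbour distance is an assumption made purely for illustration; it is not the MDST functional studied in this paper. Regions are represented as predicates on points, and all names are hypothetical.

```python
from math import dist


def xi_nn(x, points):
    """Illustrative xi: distance from x to its nearest other point (0 if none)."""
    others = [dist(x, y) for y in points if y != x]
    return min(others, default=0.0)


def H(points, in_region):
    """Restricted functional (60): sum of xi over the points lying in the region."""
    return sum(xi_nn(x, points) for x in points if in_region(x))


def add_one_cost(points, in_region, origin=(0.0, 0.0)):
    """Add one cost: H(X with the origin inserted; B) - H(X; B)."""
    return H(list(points) + [origin], in_region) - H(points, in_region)
```

For the two-point set `[(1.0, 0.0), (4.0, 0.0)]` the add one cost is -1 both for the whole plane and for the ball of radius 2 about the origin: inserting the origin shortens the nearest-neighbour distance of (1, 0) from 3 to 1 and contributes its own distance 1.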

Let $\mathcal{P} := \mathcal{H}_1$ (a homogeneous Poisson point process of unit intensity on $\mathbb{R}^d$). Let $\mathcal{Q}_n := \mathcal{P} \cap R_n$ (the restriction of $\mathcal{P}$ to $R_n$). Adapting the ideas of [17], we make the following definitions.

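Definition 4.1 We say the functional $H$ is strongly stabilizing if there exist almost surely finite random variables $R$ (a radius of stabilization) and $\Delta(\infty)$ such that, with probability 1, for any $B \supseteq B(0; R)$,

\[ \Delta(\mathcal{P} \cap B(0; R) \cup \mathcal{A}; B) = \Delta(\infty), \quad \forall \text{ finite } \mathcal{A} \subset \mathbb{R}^d \setminus B(0; R). \]

We say that the functional $H$ is polynomially bounded if, for all $B \ni 0$, there exists a constant $\beta$ such that for all finite sets $\mathcal{X} \subset \mathbb{R}^d$,

\[ |H(\mathcal{X}; B)| \le \beta \left( \mathrm{diam}(\mathcal{X}) + \mathrm{card}(\mathcal{X}) \right)^\beta. \tag{61} \]

We say that $H$ is homogeneous of order $\gamma$ if for all finite $\mathcal{X} \subset \mathbb{R}^d$ and Borel $B \subseteq \mathbb{R}^d$, and all $a \in \mathbb{R}$, $H(a\mathcal{X}; aB) = a^\gamma H(\mathcal{X}; B)$.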

Let $(R_n, S_n)$, for $n = 1, 2, \ldots$, be a sequence of ordered pairs of bounded Borel subsets of $\mathbb{R}^d$, such that $S_n \subseteq R_n$ for all $n$. Assume that for all $r > 0$, $n^{-1} |\partial_r R_n| \to 0$ and $n^{-1} |\partial_r S_n| \to 0$ (the vanishing relative boundary condition). Assume also that $|R_n| = n$ for all $n$, and $|S_n|/n \to 1$ as $n \to \infty$; that $S_n$ tends to $\mathbb{R}^d$, in the sense that $\cup_{n \ge 1} \cap_{m \ge n} S_m = \mathbb{R}^d$; and that there exists a constant $\beta_1$ such that $\mathrm{diam}(R_n) \le \beta_1 n^{\beta_1}$ for all $n$ (the polynomial boundedness condition on $(R_n, S_n)_{n \ge 1}$). Subject to these conditions, the choice of $(R_n, S_n)_{n \ge 1}$ is arbitrary.

Let $U_{1,n}, U_{2,n}, \ldots$ be i.i.d. uniform random vectors on $R_n$. Let

\[ \mathcal{U}_{m,n} = \{U_{1,n}, \ldots, U_{m,n}\} \]

(a binomial point process), and for Borel $A \subseteq \mathbb{R}^d$ with $0 < |A| < \infty$, let $\mathcal{U}_{m,A}$ be the binomial point process of $m$ i.i.d. uniform random vectors on $A$.

Let $\mathcal{R}$ be the collection of all pairs $(A, B)$ with $A, B \subseteq \mathbb{R}^d$ of the form $(A, B) = (x + R_n, x + S_n)$ with $x \in \mathbb{R}^d$ and $n \in \mathbb{N}$. That is, $\mathcal{R}$ is the collection of all the $(R_n, S_n)$ and their translates.

We say that the functional $H$ satisfies the uniform bounded moments condition on $\mathcal{R}$ if

\[ \sup_{(A,B) \in \mathcal{R} : 0 \in A} \left( \sup_{|A|/2 \le m \le 3|A|/2} \left\{ E\left[ \Delta(\mathcal{U}_{m,A}; B)^4 \right] \right\} \right) < \infty. \tag{62} \]

We now state the general results, which extend those of Penrose and Yukich (Theorem 2.1 and Corollary 2.1 in [17]).

Theorem 4.1 Suppose that $H$ is strongly stabilizing, is polynomially bounded (61), and satisfies the uniform bounded moments condition (62) on $\mathcal{R}$. Then there exist constants $s^2$, $t^2$, with $0 \le t^2 \le s^2$, such that as $n \to \infty$,

(i) $n^{-1} \mathrm{Var}(H(\mathcal{Q}_n; S_n)) \to s^2$;

(ii) $n^{-1/2} (H(\mathcal{Q}_n; S_n) - E[H(\mathcal{Q}_n; S_n)]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, s^2)$;

(iii) $n^{-1} \mathrm{Var}(H(\mathcal{U}_{n,n}; S_n)) \to t^2$;

(iv) $n^{-1/2} (H(\mathcal{U}_{n,n}; S_n) - E[H(\mathcal{U}_{n,n}; S_n)]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, t^2)$.

Also, $s^2$ and $t^2$ are independent of the choice of the $(R_n, S_n)$. Further, if the distribution of $\Delta(\infty)$ is nondegenerate, then $s^2 \ge t^2 > 0$.

Let $R_0$ be a fixed bounded Borel subset of $\mathbb{R}^d$ with $|R_0| = 1$ and $|\partial R_0| = 0$. Let $(S_{0,n}, n \ge 1)$ be a sequence of Borel sets with $S_{0,n} \subseteq R_0$ such that $|S_{0,n}| \to 1$ as $n \to \infty$ and for all $r > 0$ we have $|\partial_{n^{-1/d} r} S_{0,n}| \to 0$ as $n \to \infty$.

Let $\mathcal{R}_0$ be the collection of all pairs of the form $(x + n^{1/d} R_0, x + n^{1/d} S_{0,n})$ with $n \ge 1$ and $x \in \mathbb{R}^d$. Let $\mathcal{X}_n$ be the binomial point process of $n$ i.i.d. uniform random vectors on $R_0$, and let $\mathcal{P}_n$ be a homogeneous Poisson point process of intensity $n$ on $R_0$.

Corollary 4.1 Suppose $H$ is strongly stabilizing, satisfies the uniform bounded moments condition on $\mathcal{R}_0$, is polynomially bounded and is homogeneous of order $\gamma$. Then with $s^2$, $t^2$ as in Theorem 4.1 we have that, as $n \to \infty$,

(i) $n^{(2\gamma/d)-1} \mathrm{Var}(H(\mathcal{P}_n; S_{0,n})) \to s^2$;

(ii) $n^{(\gamma/d)-1/2} (H(\mathcal{P}_n; S_{0,n}) - E[H(\mathcal{P}_n; S_{0,n})]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, s^2)$;

(iii) $n^{(2\gamma/d)-1} \mathrm{Var}(H(\mathcal{X}_n; S_{0,n})) \to t^2$;

(iv) $n^{(\gamma/d)-1/2} (H(\mathcal{X}_n; S_{0,n}) - E[H(\mathcal{X}_n; S_{0,n})]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, t^2)$.

Proof. The corollary follows from Theorem 4.1 by taking $R_n = n^{1/d} R_0$ and $S_n = n^{1/d} S_{0,n}$ (or suitable translates thereof), and scaling, since $H$ is homogeneous of order $\gamma$. □
4.3 Proof of Theorem 4.1: the Poisson case

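Let $\mathcal{P}$ be a Poisson process of unit intensity on $\mathbb{R}^d$. We say the functional $H$ is weakly stabilizing on $\mathcal{R}$ if there is a random variable $\Delta(\infty)$ such that

\[ \Delta(\mathcal{P} \cap A; B) \stackrel{\text{a.s.}}{\longrightarrow} \Delta(\infty), \tag{63} \]

as $(A, B) \to \mathbb{R}^d$ through $\mathcal{R}$, by which we mean (63) holds whenever $(A, B)$ is an $\mathcal{R}$-valued sequence of the form $(A_n, B_n)_{n \ge 1}$, such that $\cup_{n \ge 1} \cap_{m \ge n} B_m = \mathbb{R}^d$. Note that strong stabilization of $H$ implies weak stabilization of $H$.

We say the functional $H$ satisfies the Poisson bounded moments condition on $\mathcal{R}$ if

\[ \sup_{(A,B) \in \mathcal{R} : 0 \in A} \left\{ E\left[ \Delta(\mathcal{P} \cap A; B)^4 \right] \right\} < \infty. \tag{64} \]

Theorem 4.2 Suppose that $H$ is weakly stabilizing on $\mathcal{R}$ (63) and satisfies (64). Then there exists $s^2 \ge 0$ such that as $n \to \infty$, $n^{-1} \mathrm{Var}[H(\mathcal{Q}_n; S_n)] \to s^2$ and $n^{-1/2} (H(\mathcal{Q}_n; S_n) - E[H(\mathcal{Q}_n; S_n)]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, s^2)$.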

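Before proving Theorem 4.2 we require further definitions and a lemma. Let $\mathcal{P}'$ be an independent copy of the Poisson process $\mathcal{P}$. For $x \in \mathbb{Z}^d$, set

\[ \mathcal{P}''(x) = \left( \mathcal{P} \setminus Q(x; 1/2) \right) \cup \left( \mathcal{P}' \cap Q(x; 1/2) \right). \]

Then given a translation invariant functional $H$ on point sets in $\mathbb{R}^d$, define

\[ \Delta_x(A; B) := H(\mathcal{P}''(x) \cap A; B) - H(\mathcal{P} \cap A; B); \]

this is the change in $H(\mathcal{P} \cap A; B)$ when the Poisson points in $Q(x; 1/2)$ are resampled.

Lemma 4.2 Suppose $H$ is weakly stabilizing on $\mathcal{R}$. Then for all $x \in \mathbb{Z}^d$, there is a random variable $\Delta_x(\infty)$ such that

\[ \Delta_x(A; B) \stackrel{\text{a.s.}}{\longrightarrow} \Delta_x(\infty), \tag{65} \]

as $(A, B) \to \mathbb{R}^d$ through $\mathcal{R}$. Moreover, if $H$ satisfies (64), then

\[ \sup_{(A,B) \in \mathcal{R},\, x \in \mathbb{Z}^d} E\left[ (\Delta_x(A; B))^4 \right] < \infty. \tag{66} \]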

(A,B)G7^,xeZd 

Proof. Set $C_0 = Q(0; 1/2)$. By translation invariance, we need only consider the case $x = 0$, and thus it suffices to prove that the variables $H(\mathcal{P} \cap A; B) - H(\mathcal{P} \cap A \setminus C_0; B)$ converge almost surely as $(A, B) \to \mathbb{R}^d$ through $\mathcal{R}$.

The number $N$ of points of $\mathcal{P}$ in $C_0$ is Poisson with parameter 1. Let $V_1, V_2, \ldots, V_N$ be the points of $\mathcal{P} \cap C_0$, taken in an order chosen uniformly at random from the $N!$ possibilities. Then, provided $C_0 \subseteq A$,

\[ H(\mathcal{P} \cap A; B) - H(\mathcal{P} \cap A \setminus C_0; B) = \sum_{i=0}^{N-1} \delta_i(A; B), \]

where

\[ \delta_i(A; B) := H\left( (\mathcal{P} \cap A \setminus C_0) \cup \{V_1, \ldots, V_{i+1}\}; B \right) - H\left( (\mathcal{P} \cap A \setminus C_0) \cup \{V_1, \ldots, V_i\}; B \right). \]

Since $N$ is a.s. finite, it suffices to prove that each $\delta_i(A; B)$ converges almost surely as $(A, B) \to \mathbb{R}^d$ through $\mathcal{R}$. Let $U$ be a uniform random vector on $C_0$, independent of $\mathcal{P}$. The distribution of the translated point process $-V_{i+1} + \{V_1, \ldots, V_i\} \cup (\mathcal{P} \setminus C_0)$ is the same as the conditional distribution of $\mathcal{P}$ given that the number of points in $-U + C_0$ is equal to $i$, an event of strictly positive probability. By assumption, this satisfies weak stabilization, which proves (65).

Next we prove (66). If $Q(x; 1/2) \cap A = \emptyset$ then $\Delta_x(A; B)$ is zero with probability 1. By translation invariance, it suffices to consider the $x = 0$ case, that is, to prove

\[ \sup_{(A,B) \in \mathcal{R} : C_0 \cap A \ne \emptyset} E\left[ (\Delta_0(A; B))^4 \right] < \infty. \tag{67} \]

The proof of this now follows the proof of (3.4) of [17], but with $\delta_i(A)$ replaced by $\delta_i(A; B)$ everywhere. □



Proof of Theorem 4.2. Here we can assume, without loss of generality, that $\mathcal{Q}_n = \mathcal{P} \cap R_n$. For $x \in \mathbb{Z}^d$, let $\mathcal{F}_x$ denote the $\sigma$-field generated by the points of $\mathcal{P}$ in $\cup_{y \in \mathbb{Z}^d : y \le x} Q(y; 1/2)$, where the order in the union is the lexicographic order on $\mathbb{Z}^d$.

Let $R_n'$ be the set of points $x \in \mathbb{Z}^d$ such that $Q(x; 1/2) \cap R_n \ne \emptyset$. Let $k_n = \mathrm{card}(R_n')$. Then we have that

\[ R_n \subseteq \bigcup_{x \in R_n'} Q(x; 1/2) \subseteq R_n \cup \partial_1(R_n), \]

so that

\[ |R_n| \le k_n \le |R_n| + |\partial_1(R_n)|. \]

The vanishing relative boundary condition then implies that $k_n/n \to 1$ as $n \to \infty$.

Define the filtration $(\mathcal{G}_0, \mathcal{G}_1, \ldots, \mathcal{G}_{k_n})$ as follows: let $\mathcal{G}_0$ be the trivial $\sigma$-field, label the elements of $R_n'$ in lexicographic order as $x_1, \ldots, x_{k_n}$, and let $\mathcal{G}_i = \mathcal{F}_{x_i}$ for $1 \le i \le k_n$. Then $H(\mathcal{Q}_n; S_n) - E[H(\mathcal{Q}_n; S_n)] = \sum_{i=1}^{k_n} D_i$, where we set

\[ D_i = E[H(\mathcal{Q}_n; S_n) \mid \mathcal{G}_i] - E[H(\mathcal{Q}_n; S_n) \mid \mathcal{G}_{i-1}] = E[-\Delta_{x_i}(R_n; S_n) \mid \mathcal{F}_{x_i}]. \tag{68} \]

By orthogonality of martingale differences, $\mathrm{Var}[H(\mathcal{Q}_n; S_n)] = E \sum_{i=1}^{k_n} D_i^2$. By this fact, along with a CLT for martingale differences (Theorem 2.3 of  or Theorem 2.10 of ^U), it suffices to prove the conditions

\[ \sup_{n \ge 1} E\left[ \max_{1 \le i \le k_n} \left( k_n^{-1/2} |D_i| \right)^2 \right] < \infty, \tag{69} \]

\[ k_n^{-1/2} \max_{1 \le i \le k_n} |D_i| \stackrel{P}{\longrightarrow} 0, \tag{70} \]

and for some $s^2 \ge 0$,

\[ k_n^{-1} \sum_{i=1}^{k_n} D_i^2 \stackrel{L^1}{\longrightarrow} s^2. \tag{71} \]

Using (66), and the representation (68) for $D_i$, we can verify (69) and (70) in just the same manner as for the equivalent estimates (3.7) and (3.8) in [17].

We now prove (71). By (65), for each $x \in \mathbb{Z}^d$ the variables $\Delta_x(A; B)$ converge almost surely to a limit, denoted $\Delta_x(\infty)$, as $(A, B) \to \mathbb{R}^d$ through $\mathcal{R}$. For $x \in \mathbb{Z}^d$ and $(A, B) \in \mathcal{R}$, let

\[ F_x(A; B) = E[\Delta_x(A; B) \mid \mathcal{F}_x]; \qquad F_x = E[\Delta_x(\infty) \mid \mathcal{F}_x]. \]

Then $(F_x, x \in \mathbb{Z}^d)$ is a stationary family of random variables. Set $s^2 = E[F_0^2]$. We claim that the ergodic theorem implies

\[ k_n^{-1} \sum_{x \in R_n'} F_x^2 \stackrel{L^1}{\longrightarrow} s^2. \tag{72} \]

The proof of this follows, with minor modifications, the proof of the corresponding result (3.10) in [17].

We need to show that $F_x(R_n; S_n)^2$ approximates to $F_x^2$. We consider $x$ at the origin $0$. For any $(A, B) \in \mathcal{R}$, by Cauchy-Schwarz,

\[ E\left[ |F_0(A; B)^2 - F_0^2| \right] \le \left( E\left[ (F_0(A; B) + F_0)^2 \right] \right)^{1/2} \left( E\left[ (F_0(A; B) - F_0)^2 \right] \right)^{1/2}. \tag{73} \]

By the definition of $F_0$ and the conditional Jensen inequality,

\begin{align*}
E\left[ (F_0(A; B) + F_0)^2 \right] &= E\left[ \left( E[\Delta_0(A; B) + \Delta_0(\infty) \mid \mathcal{F}_0] \right)^2 \right] \\
&\le E\left[ E[(\Delta_0(A; B) + \Delta_0(\infty))^2 \mid \mathcal{F}_0] \right] \\
&= E\left[ (\Delta_0(A; B) + \Delta_0(\infty))^2 \right],
\end{align*}

which is uniformly bounded by (65) and (66). Similarly,

\[ E\left[ (F_0(A; B) - F_0)^2 \right] \le E\left[ (\Delta_0(A; B) - \Delta_0(\infty))^2 \right], \tag{74} \]

which is also uniformly bounded by (65) and (66). For any $\mathcal{R}$-valued sequence $(A_n, B_n)_{n \ge 1}$ with $\cup_{n \ge 1} \cap_{m \ge n} B_m = \mathbb{R}^d$, the sequence $(\Delta_0(A_n; B_n) - \Delta_0(\infty))^2$ tends to 0 almost surely by (65), and is uniformly integrable by (66), and therefore the expression (74) tends to zero so that by (73), $E[|F_0(A_n; B_n)^2 - F_0^2|] \to 0$.

Returning to the given sequence $(R_n, S_n)$, let $\varepsilon > 0$. By the vanishing relative boundary condition, we can choose $K_n$ so that $\lim_{n \to \infty} K_n = \infty$ and $|\partial_{K_n} S_n| \le \varepsilon n$ for all $n$. Let $S_n'$ be the set of $x \in \mathbb{Z}^d$ such that $Q(x; 1/2)$ has non-empty intersection with $S_n \setminus \partial_{K_n}(S_n)$. Using the conclusion of the previous paragraph and translation invariance, it is not hard to deduce that

\[ \lim_{n \to \infty} \sup_{x \in S_n'} E\left[ |F_x(R_n; S_n)^2 - F_x^2| \right] = 0. \tag{75} \]

Also, since we assume $|S_n| \sim n$ we have $\mathrm{card}(S_n') \ge |S_n| - \varepsilon n \ge (1 - 2\varepsilon) n$ for large enough $n$. Using this with (75), the uniform boundedness of $E[|F_x(R_n; S_n)^2 - F_x^2|]$ and the fact that $\varepsilon$ can be taken arbitrarily small in the above argument, it is routine to deduce that

\[ k_n^{-1} \sum_{x \in R_n'} E\left[ |F_x(R_n; S_n)^2 - F_x^2| \right] \to 0, \]

and therefore (72) remains true with $F_x$ replaced by $F_x(R_n; S_n)$; that is, (71) holds and the proof of Theorem 4.2 is complete. □

4.4 Proof of Theorem 4.1: the non-Poisson case

In this section we complete the proof of Theorem 4.1. The first step is to show that the conditions of Theorem 4.1 imply those of Theorem 4.2, as follows.

Lemma 4.3 If $H$ satisfies the uniform bounded moments condition (62) and is polynomially bounded, then $H$ satisfies the Poisson bounded moments condition (64).

Proof. The proof follows, with minor modifications, that of Lemma 4.1 of [17]. □

It follows from Lemma 4.3 that if $H$ satisfies the conditions of Theorem 4.1 then Theorem 4.2 applies and we have the Poisson parts of Theorem 4.1. To de-Poissonize these limits we follow [17]. Define

\[ R_{m,n} := H(\mathcal{U}_{m+1,n}; S_n) - H(\mathcal{U}_{m,n}; S_n). \]

We use the following coupling lemma.

Lemma 4.4 Suppose $H$ is strongly stabilizing. Let $\varepsilon > 0$. Then there exist $\delta > 0$ and $n_0 \ge 1$ such that for all $n \ge n_0$ and all $m, m' \in [(1-\delta)n, (1+\delta)n]$ with $m < m'$, there exists a coupled family of variables $D, D', R, R'$ with the following properties:

(i) $D$ and $D'$ each have the same distribution as $\Delta(\infty)$;

(ii) $D$ and $D'$ are independent;

(iii) $(R, R')$ have the same joint distribution as $(R_{m,n}, R_{m',n})$;

(iv) $P[\{D \ne R\} \cup \{D' \ne R'\}] < \varepsilon$.

Proof. Since we assume $|S_n|/|R_n| \to 1$, the probability that a random $d$-vector uniformly distributed over $R_n$ lies in $S_n$ tends to 1 as $n \to \infty$. Using this fact the proof follows, with some minor modifications, that of the corresponding result in [17], Lemma 4.2. □

Lemma 4.5 Suppose $H$ is strongly stabilizing and satisfies the uniform bounded moments condition (62). Let $(h(n))_{n \ge 1}$ be a sequence with $n^{-1} h(n) \to 0$ as $n \to \infty$. Then

\[ \lim_{n \to \infty} \sup_{|n - m| \le h(n)} \left| E R_{m,n} - E \Delta(\infty) \right| = 0; \tag{76} \]

\[ \lim_{n \to \infty} \sup_{n - h(n) \le m < m' \le n + h(n)} \left| E R_{m,n} R_{m',n} - (E \Delta(\infty))^2 \right| = 0; \tag{77} \]

\[ \limsup_{n \to \infty} \sup_{|n - m| \le h(n)} E R_{m,n}^2 < \infty. \tag{78} \]

Proof. The proof follows that of Lemma 4.3 of [17]. □

Proof of Theorem 4.1. Theorem 4.1 now follows in the same way as Theorem 2.1 in [17], replacing $H(\cdot)$ with $H(\cdot\,; S_n)$. □



5 Proof of Theorem 2.1: Laws of large numbers

We now derive our law of large numbers for the total weight of the random MDSF on the unit square. We consider the general partial order $\preccurlyeq^{\theta,\phi}$, for $0 \le \theta < 2\pi$ and $0 < \phi \le \pi$ or $\phi = 2\pi$. Recall that $y \preccurlyeq^{\theta,\phi} x$ if $y \in C_{\theta,\phi}(x)$, where $C_{\theta,\phi}(x)$ is the cone formed by the rays at $\theta$ and $\theta + \phi$ measured anticlockwise from the upwards vertical.

We consider the random point set $\mathcal{X}_n$, the binomial point process of $n$ independent uniformly distributed points on $(0, 1]^2$. However, the law of large numbers of Theorem 2.1 also holds (with virtually the same proof) if the points of $\mathcal{X}_n$ are uniformly distributed on an arbitrary convex set in $\mathbb{R}^2$ of unit area. If the points are distributed in $\mathbb{R}^2$ with a density function $f$ that has convex support and is bounded away from 0 and infinity on its support, then the same result holds with a factor of $\int_{\mathbb{R}^2} f(x)^{(2-\alpha)/2} \, dx$ introduced into the right hand side (cf. eqn (2.9) of [18]).

For the general partial order given by $\theta, \phi$ we apply Lemma 4.1 to obtain a law of large numbers for $\mathcal{L}^\alpha(\mathcal{X}_n)$. As a special case, we thus obtain a law of large numbers under the partial order $\preccurlyeq^*$ given by $\theta = \phi = \pi/2$. This method enables us to evaluate the limit explicitly, unlike methods based on the subadditivity of the functional, which may also be applicable here (see the remark at the end of this section).

In applying Lemma 4.1 to the MDSF functional, we take the dimension $d$ in the
lemma to be 2, and take $f(\mathbf{x})$ (the underlying probability density function in the
lemma) to be 1 for $\mathbf{x} \in (0,1]^2$ and zero elsewhere. We take $\xi(\mathbf{x}; \mathcal{X})$ to be $d(\mathbf{x}; \mathcal{X})^\alpha$,
where $d(\mathbf{x}; \mathcal{X})$ is the distance from point $\mathbf{x}$ to its directed nearest neighbour in $\mathcal{X}$
under $\preccurlyeq^{\theta,\phi}$, if such a neighbour exists, or zero otherwise. Thus in our case

$$\xi(\mathbf{x}; \mathcal{X}) = (d(\mathbf{x}; \mathcal{X}))^\alpha \quad \text{with} \quad d(\mathbf{x}; \mathcal{X}) := \min\{\|\mathbf{x} - \mathbf{y}\| : \mathbf{y} \in \mathcal{X} \setminus \{\mathbf{x}\},\ \mathbf{y} \preccurlyeq^{\theta,\phi} \mathbf{x}\} \tag{79}$$

with the convention that $\min\{\} = 0$. We need to show this choice of $\xi$ satisfies the
conditions of Lemma 4.1. As before, $\mathcal{H}_\tau$ denotes a homogeneous Poisson process on
$\mathbb{R}^d$ of intensity $\tau$, now with $d = 2$.

Lemma 5.1 Let $\tau > 0$. Then $\xi$ is almost surely stabilizing on $\mathcal{H}_\tau$, in the sense of
\5^, with limit Coo(Wr) = (d(0; W^))". 

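Proof. Let $R$ be the (random) distance from $\mathbf{0}$ to its directed nearest neighbour
in $\mathcal{H}_\tau$, i.e. $R = d(\mathbf{0}; \mathcal{H}_\tau)$. Since $\phi > 0$ and $\tau > 0$, we have $0 < R < \infty$ almost
surely. But then for any $\ell > R$, we have $\xi(\mathbf{0}; (\mathcal{H}_\tau \cap B(\mathbf{0}; \ell)) \cup \mathcal{A}) = R^\alpha$ for any
finite $\mathcal{A} \subset \mathbb{R}^d \setminus B(\mathbf{0}; \ell)$. Thus $\xi$ stabilizes on $\mathcal{H}_\tau$ with limit $\xi_\infty(\mathcal{H}_\tau) = R^\alpha$. □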
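The functional in (79) is easy to experiment with numerically. The following is a minimal sketch, assuming the special case of the coordinatewise order $\preccurlyeq^*$ (the general cone $C_{\theta,\phi}$ is not implemented); the function names are illustrative and do not come from the paper.

```python
import math

def precedes(y, x):
    """Coordinatewise order: y precedes x if both coordinates of y are <= those of x."""
    return y[0] <= x[0] and y[1] <= x[1]

def directed_nn_distance(x, points):
    """d(x; X) as in (79): distance from x to the nearest other point preceding it,
    or 0.0 if no such point exists (the convention min{} = 0)."""
    dists = [math.dist(x, y) for y in points if y != x and precedes(y, x)]
    return min(dists) if dists else 0.0

def total_weight(points, alpha):
    """Sum over the point set of xi(x; X) = d(x; X) ** alpha."""
    return sum(directed_nn_distance(x, points) ** alpha for x in points)

pts = [(0.1, 0.1), (0.4, 0.5), (0.5, 0.2)]
print(total_weight(pts, 1.0))
```

In this small example the point (0.1, 0.1) is minimal and contributes nothing, while the other two points are each joined to it.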

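Before proving that our choice of $\xi$ satisfies the moments condition for Lemma
4.1, we give a geometrical lemma. For $B \subset \mathbb{R}^2$ with $B$ bounded, and for $\mathbf{x} \in B$,
write $\mathrm{dist}(\mathbf{x}; \partial B)$ for $\sup\{r : B(\mathbf{x}; r) \subseteq B\}$, and for $s > 0$, define the region

$$A_{\theta,\phi}(\mathbf{x}, s; B) := B(\mathbf{x}; s) \cap B \cap C_{\theta,\phi}(\mathbf{x}). \tag{80}$$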



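Lemma 5.2 Let $B$ be a convex bounded set in $\mathbb{R}^2$, and let $\mathbf{x} \in B$. If $A_{\theta,\phi}(\mathbf{x}, s; B) \cap
\partial B(\mathbf{x}; s) \neq \emptyset$, and $s > \mathrm{dist}(\mathbf{x}, \partial B)$, then

$$|A_{\theta,\phi}(\mathbf{x}, s; B)| \ge s \sin(\phi/2)\,\mathrm{dist}(\mathbf{x}, \partial B)/2.$$

Proof. The condition $A_{\theta,\phi}(\mathbf{x}, s; B) \cap \partial B(\mathbf{x}; s) \neq \emptyset$ says that there exists $\mathbf{y} \in
B \cap C_{\theta,\phi}(\mathbf{x})$ with $\|\mathbf{y} - \mathbf{x}\| = s$. The line segment $\mathbf{xy}$ is contained in the cone
$C_{\theta,\phi}(\mathbf{x})$; take a half-line $h$ starting from $\mathbf{x}$, at an angle $\phi/2$ to the line segment $\mathbf{xy}$
and such that $h$ is also contained in $C_{\theta,\phi}(\mathbf{x})$. Let $\mathbf{z}$ be the point in $h$ at a distance
$\mathrm{dist}(\mathbf{x}, \partial B)$ from $\mathbf{x}$. Then the interior of the triangle $\mathbf{xyz}$ is entirely contained in
$A_{\theta,\phi}(\mathbf{x}, s; B)$, and has area $s \sin(\phi/2)\,\mathrm{dist}(\mathbf{x}, \partial B)/2$. □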



Lemma 5.3 Suppose a > 0. Then £^ given by i79\) satisfies the moments condition 
115^} for any p £ (1/a, 2/a]. 



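Proof. Setting $R_n := (0, n^{1/2}]^2$, we have

$$E\left[\xi(n^{1/2}\mathbf{X}_1; n^{1/2}\mathcal{X}_n)^p\right] = \int_{R_n} E\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p\right] \frac{d\mathbf{x}}{n}. \tag{81}$$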



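For $\mathbf{x} \in R_n$ set $m(\mathbf{x}) := \mathrm{dist}(\mathbf{x}, \partial R_n)$. Let us divide $R_n$ into three regions

$$R_n(1) := \{\mathbf{x} \in R_n : m(\mathbf{x}) \le n^{-1/2}\}; \quad R_n(2) := \{\mathbf{x} \in R_n : m(\mathbf{x}) > 1\};$$
$$R_n(3) := \{\mathbf{x} \in R_n : n^{-1/2} < m(\mathbf{x}) \le 1\}.$$

For all $\mathbf{x} \in R_n$, we have $\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1}) \le (2n)^{\alpha/2}$, and hence, since $R_n(1)$ has area
at most 4, we can bound the contribution to (81) from $\mathbf{x} \in R_n(1)$ by

$$\int_{\mathbf{x} \in R_n(1)} E\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p\right] \frac{d\mathbf{x}}{n} \le 4 n^{-1} (2n)^{p\alpha/2} = 2^{2 + p\alpha/2}\, n^{(p\alpha - 2)/2}, \tag{82}$$

which is bounded provided $p\alpha \le 2$.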



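Now, for $\mathbf{x} \in R_n$, with $A_{\theta,\phi}(\cdot)$ defined at (80), we have

$$P\left[d(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1}) > s\right] \le P\left[n^{1/2}\mathcal{X}_{n-1} \cap A_{\theta,\phi}(\mathbf{x}, s; R_n) = \emptyset\right] = \left(1 - \frac{|A_{\theta,\phi}(\mathbf{x}, s; R_n)|}{n}\right)^{n-1} \le \exp\left(1 - |A_{\theta,\phi}(\mathbf{x}, s; R_n)|\right), \tag{83}$$

since $|A_{\theta,\phi}(\mathbf{x}, s; R_n)| \le n$. For $\mathbf{x} \in R_n$ and $s > m(\mathbf{x})$, by Lemma 5.2 we have

$$|A_{\theta,\phi}(\mathbf{x}, s; R_n)| \ge \sin(\phi/2)\, s\, m(\mathbf{x})/2 \quad \text{if } A_{\theta,\phi}(\mathbf{x}, s; R_n) \cap \partial B(\mathbf{x}; s) \neq \emptyset,$$

and also

$$P\left[d(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1}) > s\right] = 0 \quad \text{if } A_{\theta,\phi}(\mathbf{x}, s; R_n) \cap \partial B(\mathbf{x}; s) = \emptyset.$$

For $s \le m(\mathbf{x})$, we have that $|A_{\theta,\phi}(\mathbf{x}, s; R_n)| = (\phi/2)s^2 \ge \sin(\phi/2)s^2$. Combining these
observations and (83), we obtain for all $\mathbf{x} \in R_n$ and $s > 0$ that

$$P\left[d(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1}) > s\right] \le \exp\left(1 - \sin(\phi/2)\, s \min(s, m(\mathbf{x}))/2\right), \quad \mathbf{x} \in R_n.$$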



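Setting $c = (1/2)\sin(\phi/2)$, we therefore have for $\mathbf{x} \in R_n$ that

$$\begin{aligned}
E\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p\right] &= \int_0^\infty P\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p > r\right] dr = \int_0^\infty P\left[d(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1}) > r^{1/(\alpha p)}\right] dr \\
&\le \int_0^{m(\mathbf{x})^{\alpha p}} dr\, \exp\left(1 - c\, r^{2/(\alpha p)}\right) + \int_{m(\mathbf{x})^{\alpha p}}^\infty dr\, \exp\left(1 - c\, m(\mathbf{x})\, r^{1/(\alpha p)}\right) \\
&= O(1) + \int_{m(\mathbf{x})^2}^\infty e^{1 - cu}\, \alpha p\, u^{\alpha p - 1}\, m(\mathbf{x})^{-\alpha p}\, du \\
&= O(1) + O\left(m(\mathbf{x})^{-\alpha p}\right).
\end{aligned} \tag{84}$$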



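For $\mathbf{x} \in R_n(2)$, this bound is $O(1)$, and the area of $R_n(2)$ is less than $n$, so that the
contribution to (81) from $R_n(2)$ satisfies

$$\limsup_{n \to \infty} \int_{R_n(2)} E\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p\right] \frac{d\mathbf{x}}{n} < \infty. \tag{85}$$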



Finally, by (84), there is a constant $c'$ such that if $\alpha p > 1$, the contribution to (81)
from $R_n(3)$ satisfies

$$\int_{R_n(3)} E\left[\xi(\mathbf{x}; n^{1/2}\mathcal{X}_{n-1})^p\right] \frac{d\mathbf{x}}{n} \le c' n^{-1/2} \int_{u = n^{-1/2}}^{1} u^{-\alpha p}\, du \le \frac{c' n^{-1/2}}{\alpha p - 1}\, n^{(\alpha p - 1)/2},$$

which is bounded provided $\alpha p \le 2$. Combined with the bounds in (82) and (85),
this shows that the expression (81) is uniformly bounded, provided $1 < \alpha p \le 2$. □

Following notation from Section 4.2, for $k \in \mathbb{N}$, and for $a < b$ and $c < d$ let
$\mathcal{U}_{k,(a,b]\times(c,d]}$ denote the point process consisting of $k$ independent random vectors
uniformly distributed on the rectangle $(a,b] \times (c,d]$. Before proceeding further, we
recall that if $M(\mathcal{X})$ denotes the number of minimal elements (under the ordering
$\preccurlyeq^*$) of a point set $\mathcal{X} \subset \mathbb{R}^2$, then

$$E[M(\mathcal{U}_{k,(a,b]\times(c,d]})] = E[M(\mathcal{X}_k)] = 1 + (1/2) + \cdots + (1/k) \le 1 + \log k. \tag{86}$$

The first equality in (86) comes from some obvious scaling which shows that the
distribution of $M(\mathcal{U}_{k,(a,b]\times(c,d]})$ does not depend on $a, b, c, d$. For the second equality
in see [2] or the proof of Theorem 1.1(a) of [HI. 
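The harmonic-number identity in (86) can be checked by simulation. The sketch below is illustrative only: it counts minimal elements of uniform samples under the coordinatewise order and compares the empirical mean with $1 + 1/2 + \cdots + 1/k$. The function names, seed and sample sizes are assumptions made for the example.

```python
import math
import random

def count_minimal(points):
    """Number of points that have no other point coordinatewise below them."""
    return sum(
        1 for p in points
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)
    )

def harmonic(k):
    """The k-th harmonic number 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))

def mean_minimal(k, trials, seed=0):
    """Empirical mean number of minimal elements among k uniform points in the unit square."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(k)]
        total += count_minimal(pts)
    return total / trials

k = 20
print(mean_minimal(k, 2000), harmonic(k), 1 + math.log(k))
```

The empirical mean should be close to the harmonic number, which in turn sits below the bound $1 + \log k$ used repeatedly in the proofs.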

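Proof of Theorem 2.1. Suppose $0 < \alpha < 2$, and set $f(\cdot)$ to be the indicator of the
unit square $(0,1]^2$. By Lemmas 5.1 and 5.3, our functional $\xi$, given at (79), satisfies
the conditions of Lemma 4.1 with $p = 2/\alpha$ and $q = 1$, with this choice of $f$. So by
Lemma 4.1, we have that

$$n^{(\alpha/2)-1} \mathcal{L}^\alpha(\mathcal{X}_n) \xrightarrow{L^1} \int_{\mathbb{R}^2} E\left[\xi_\infty(\mathcal{H}_{f(\mathbf{x})})\right] f(\mathbf{x})\, d\mathbf{x} = E\,\xi_\infty(\mathcal{H}_1). \tag{87}$$

Since the disk sector $C_{\theta,\phi}(\mathbf{x}) \cap B(\mathbf{x}; r)$ has area $(\phi/2)r^2$, by Lemma 5.1 we have

$$P[\xi_\infty(\mathcal{H}_1) > s] = P\left[\mathcal{H}_1 \cap C_{\theta,\phi}(\mathbf{0}) \cap B(\mathbf{0}; s^{1/\alpha}) = \emptyset\right] = \exp\left(-(\phi/2)s^{2/\alpha}\right).$$

Hence, the limit in (87) is

$$E[\xi_\infty(\mathcal{H}_1)] = \int_0^\infty P[\xi_\infty(\mathcal{H}_1) > s]\, ds = \alpha\, 2^{(\alpha-2)/2}\, \phi^{-\alpha/2}\, \Gamma(\alpha/2),$$

and this gives us (2). Finally, in the case where $\preccurlyeq^{\theta,\phi}\, =\, \preccurlyeq^*$, (2) remains true when $\mathcal{X}_n$
is replaced by $\mathcal{X}_n^0$, since

$$E\left[n^{(\alpha/2)-1} \left|\mathcal{L}^\alpha(\mathcal{X}_n^0) - \mathcal{L}^\alpha(\mathcal{X}_n)\right|\right] \le 2^{\alpha/2}\, n^{(\alpha/2)-1}\, E[M(\mathcal{X}_n)], \tag{88}$$

where $M(\mathcal{X}_n)$ denotes the number of minimal elements of $\mathcal{X}_n$. By (86), $E[M(\mathcal{X}_n)] \le
1 + \log n$, and hence the right hand side of (88) tends to 0 as $n \to \infty$ for $0 < \alpha < 2$.
This gives us (2) with $\mathcal{X}_n^0$ under $\preccurlyeq^*$. □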
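The closed form for the limit can be sanity-checked numerically. The following sketch compares $\alpha\, 2^{(\alpha-2)/2} \phi^{-\alpha/2} \Gamma(\alpha/2)$ with a midpoint-rule approximation of $\int_0^\infty \exp(-(\phi/2)s^{2/\alpha})\,ds$. The truncation point and step count are assumptions chosen for illustration, not part of the proof.

```python
import math

def limit_constant(alpha, phi):
    """Closed form alpha * 2**((alpha-2)/2) * phi**(-alpha/2) * Gamma(alpha/2)."""
    return alpha * 2 ** ((alpha - 2) / 2) * phi ** (-alpha / 2) * math.gamma(alpha / 2)

def tail_integral(alpha, phi, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-(phi/2) * s**(2/alpha))
    over [0, upper]; the tail beyond `upper` is negligible for the values used here."""
    h = upper / steps
    return h * sum(
        math.exp(-(phi / 2) * ((i + 0.5) * h) ** (2 / alpha)) for i in range(steps)
    )

# Coordinatewise order (phi = pi/2) with Euclidean length (alpha = 1).
print(limit_constant(1.0, math.pi / 2), tail_integral(1.0, math.pi / 2))
```

For $\alpha = 1$ and $\phi = \pi/2$ the integrand is Gaussian and both quantities equal 1; the closed form also agrees with $(2/\phi)^{\alpha/2}\,\Gamma(1 + \alpha/2)$.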

Remark. A law of large numbers for Euclidean functionals of many random geo- 
metric structures can be treated by the boundary functional approach of Yukich |^ . 
It can be shown that the MDSF satisfies some, but possibly not all, of the appropriate
conditions that would allow this approach to be successful. The MDSF functional
is subadditive, its corresponding boundary functional is superadditive, and
the functional and its boundary functional are sufficiently 'close in mean'. However,
it is not clear that the functional is 'smooth', since the degree of the graph is not
bounded.

6 Central limit theorem away from the boundary

While it should be possible to adapt the argument of the present section to more
general partial orders, from now on we take the partial order $\preccurlyeq$ on $\mathbb{R}^2$ to be $\preccurlyeq^*$. For
each $n$, define the region $S_{0,n} := (n^{\varepsilon - 1/2}, 1]^2$, where $\varepsilon \in (0, 1/2)$ is a small constant
to be chosen later. In this section, we use the general central limit theorems of
Section 4.2 to demonstrate a central limit theorem for the contribution to the total
weight of the MDSF, under $\preccurlyeq^*$, from edges away from the boundary, that is from
points in the region $S_{0,n}$.

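Given $\alpha > 0$, consider the MDSF total weight functional $H = \mathcal{L}^\alpha$ on point sets
in $\mathbb{R}^2$. For $\mathbf{x} \in \mathcal{X}$, let the directed nearest neighbour distance $d(\mathbf{x}; \mathcal{X})$ and the
corresponding $\alpha$-weighted functional $\xi(\mathbf{x}; \mathcal{X})$ be given by (79), where now we take
$\preccurlyeq^{\theta,\phi}$ to be $\preccurlyeq^*$. For $R \subseteq \mathbb{R}^2$, set

$$\mathcal{L}^\alpha(\mathcal{X}; R) = \sum_{\mathbf{x} \in \mathcal{X} \cap R} \xi(\mathbf{x}; \mathcal{X}), \tag{89}$$

and set $\mathcal{L}^\alpha(\mathcal{X}) := \mathcal{L}^\alpha(\mathcal{X}; \mathbb{R}^2)$.

Let $\mathcal{X}_n$ be the binomial point process of $n$ i.i.d. uniform random vectors on $(0,1]^2$,
and let $\mathcal{P}_n$ be the homogeneous Poisson process of intensity $n$ on $(0,1]^2$. The main
result of this section is the following.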

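Theorem 6.1 Suppose that $\alpha > 0$ and the partial order is $\preccurlyeq^*$. Then there exist
constants $0 < t_\alpha \le s_\alpha$, not depending on the choice of $\varepsilon$, such that, as $n \to \infty$,

(i) $n^{\alpha-1}\,\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{X}_n; S_{0,n})] \to t_\alpha^2$;

(ii) $n^{(\alpha-1)/2}\, \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; S_{0,n}) \xrightarrow{d} \mathcal{N}(0, t_\alpha^2)$;

(iii) $n^{\alpha-1}\,\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n; S_{0,n})] \to s_\alpha^2$;

(iv) $n^{(\alpha-1)/2}\, \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; S_{0,n}) \xrightarrow{d} \mathcal{N}(0, s_\alpha^2)$.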

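The following corollary states that Theorem 6.1 remains true in the rooted cases
too, i.e. with $\mathcal{X}_n$ replaced by $\mathcal{X}_n^0$ and $\mathcal{P}_n$ replaced by $\mathcal{P}_n^0$.

Corollary 6.1 Suppose that $\alpha > 0$ and the partial order is $\preccurlyeq^*$. Then, with $t_\alpha$, $s_\alpha$
as given in Theorem 6.1, we have that as $n \to \infty$,

(i) $n^{\alpha-1}\,\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{X}_n^0; S_{0,n})] \to t_\alpha^2$;

(ii) $n^{(\alpha-1)/2}\, \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n^0; S_{0,n}) \xrightarrow{d} \mathcal{N}(0, t_\alpha^2)$;

(iii) $n^{\alpha-1}\,\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n^0; S_{0,n})] \to s_\alpha^2$;

(iv) $n^{(\alpha-1)/2}\, \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n^0; S_{0,n}) \xrightarrow{d} \mathcal{N}(0, s_\alpha^2)$.

Proof. For each region $R \subseteq [0,1]^2$ and point set $\mathcal{S} \subset [0,1]^2$ with $\mathbf{0} \in \mathcal{S}$, let $\mathcal{L}_0^\alpha(\mathcal{S}; R)$
denote the total weight of the edges incident to $\mathbf{0}$ in the MDST on $\mathcal{S}$ from points
in $R$. Then $\mathcal{L}^\alpha(\mathcal{P}_n^0; S_{0,n})$ equals $\mathcal{L}^\alpha(\mathcal{P}_n; S_{0,n}) + \mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})$, so that

$$\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n^0; S_{0,n})] - \mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n; S_{0,n})] = 2\,\mathrm{Cov}[\mathcal{L}^\alpha(\mathcal{P}_n; S_{0,n}), \mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})] + \mathrm{Var}[\mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})]. \tag{90}$$

Let $N_n$ denote the number of points of $\mathcal{P}_n$, and let $E_n$ denote the event that at
least one point of $\mathcal{P}_n \cap S_{0,n}$ is joined to $\mathbf{0}$ in the MDST on $\mathcal{P}_n^0$. Then

$$P[E_n] \le P\left[(0, n^{\varepsilon - 1/2}]^2 \cap \mathcal{P}_n = \emptyset\right] = \exp(-n^{2\varepsilon}),$$

and $\mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n}) \le 2^{\alpha/2} N_n \mathbf{1}_{E_n}$. Thus by the Cauchy-Schwarz inequality, for some
finite constant $C$ we have

$$\mathrm{Var}[\mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})] \le E[\mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})^2] \le C n^2 \exp(-n^{2\varepsilon}/2), \tag{91}$$

and combining this with (90), Theorem 6.1 (iii) and the Cauchy-Schwarz inequality
shows that

$$n^{\alpha-1}\left(\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n^0; S_{0,n})] - \mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n; S_{0,n})]\right) \to 0,$$

so that from Theorem 6.1 (iii) we obtain the corresponding rooted result (iii). Also,
since (91) implies $n^{\alpha-1}\,\mathrm{Var}[\mathcal{L}_0^\alpha(\mathcal{P}_n^0; S_{0,n})]$ tends to zero, from Theorem 6.1 (iv) and
Slutsky's theorem we obtain the corresponding rooted result (iv).

The binomial results (i) and (ii) follow in the same manner as above, with slight
modifications. □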



To prove Theorem 6.1 we demonstrate that our functional $\mathcal{L}^\alpha$ satisfies suitable
versions of the conditions of Theorem 4.1 and Corollary 4.1. First, we see that $\mathcal{L}^\alpha$
is polynomially bounded (see (61)), since

$$\mathcal{L}^\alpha(\mathcal{X}; B) \le (\mathrm{diam}(\mathcal{X}))^\alpha\, \mathrm{card}(\mathcal{X}).$$

Also, $\mathcal{L}^\alpha$ is homogeneous of order $\alpha$.

Lemma 6.1 $\mathcal{L}^\alpha$ is strongly stabilizing, in the sense of Definition 4.1.

Proof. To prove stabilization it is sufficient to show that there exists an almost 
surely finite random variable R, the radius of stabilization, such that the add one 
cost is unaffected by changes in the configuration at a distance greater than R from 
the added point. We show that there exists such an R. 

For $s > 0$ construct eight disjoint triangles $T_j(s)$, $1 \le j \le 8$, by splitting the
square $Q(\mathbf{0}; s)$ into eight triangles via drawing in the diagonals of the square and
the $x$ and $y$ axes. Label the triangle with vertices $(0,0)$, $(0,s)$, $(s,s)$ as $T_1(s)$ and then
label increasingly in a clockwise manner. See Figure 3. Note that $T_j(t) \subset T_j(s)$
for $t < s$. Let the random variable $S$ be the minimum $s$ such that the triangles
$T_j(s)$, $1 \le j \le 8$, each contain at least one point of $\mathcal{P}$. Then $S$ is almost surely
finite.

Figure 3: The triangles $T_1(s), \ldots, T_8(s)$, $s > 0$.

We claim that $R = 3S$ is a radius of stabilization for $\mathcal{L}^\alpha$, that is any points at
distance $d > 3S$ from the origin have no impact on the set of added or removed
edges when a point is inserted at the origin.

First, $\mathbf{0}$ can have no point at a distance of at least $3S$ away as its directed nearest
neighbour, since there will be points in $T_5$ and $T_6$ within a distance of at most $\sqrt{2}S$
of $\mathbf{0}$.

We now need to show that no point at a distance at least $3S$ from $\mathbf{0}$ can have
the origin as its directed nearest neighbour. Clearly, for the partial order $\preccurlyeq^*$, we
need only consider points in the region $(0, \infty)^2$.

Consider a point $(x, y)$ in the first quadrant, such that $\|(x,y)\| \ge 3S$. Consider
the disk sector

$$D_{(x,y)} := B\left((x,y); \|(x,y)\|\right) \cap \{\mathbf{w} : \mathbf{w} \preccurlyeq^* (x,y)\}.$$

We aim to show that given any $(x,y)$ of the above form, at least one of the $T_j(S)$,
$j = 1, \ldots, 8$, is contained in $D_{(x,y)}$, which implies that the origin cannot be the
directed nearest neighbour of $(x,y)$. To demonstrate this, we show that given such
an $(x,y)$, $D_{(x,y)}$ contains all three vertices of at least one of the $T_j(S)$.

First suppose $x > S$, $y > S$. Then we have that $T_1(S)$ and $T_2(S)$ are in $D_{(x,y)}$,
since we have, for example,

$$\|(x,y) - \mathbf{0}\|^2 - \|(x,y) - (0,S)\|^2 = (x^2 + y^2) - (x^2 + (y - S)^2) = S(2y - S) > 0.$$

By symmetry, the only other situation we need consider is when $0 < x \le S$. Then
$y^2 \ge 9S^2 - x^2 \ge 8S^2$, so $y \ge 2\sqrt{2}S$. Then we have that $T_8(S)$ is in $D_{(x,y)}$, since

$$\|(x,y) - \mathbf{0}\|^2 - \|(x,y) - (-S,S)\|^2 = (x^2 + y^2) - ((x + S)^2 + (y - S)^2) = 2S(y - x - S) \ge 4S^2(\sqrt{2} - 1) > 0.$$

This completes the proof. □
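The geometric claim in this proof lends itself to a numerical sanity check. The sketch below, with illustrative function names that are not from the paper, lists the triangles $T_j(S)$ whose three vertices all lie in the closed disk sector $D_{(x,y)}$; since the sector is convex, such a triangle lies entirely inside it.

```python
def triangles(S):
    """The eight triangles T_1(S), ..., T_8(S) as vertex triples, labelled clockwise
    starting from the triangle with vertices (0,0), (0,S), (S,S)."""
    corners = [(0, S), (S, S), (S, 0), (S, -S), (0, -S), (-S, -S), (-S, 0), (-S, S)]
    return [((0, 0), corners[j], corners[(j + 1) % 8]) for j in range(8)]

def dist2(p, q):
    """Squared Euclidean distance between p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def in_sector(w, p):
    """True if w is coordinatewise below p and no farther from p than the origin is."""
    return w[0] <= p[0] and w[1] <= p[1] and dist2(w, p) <= dist2((0, 0), p)

def covered_triangles(p, S):
    """Labels j of the triangles T_j(S) whose three vertices all lie in the sector D_p."""
    return [j + 1 for j, tri in enumerate(triangles(S))
            if all(in_sector(v, p) for v in tri)]

print(covered_triangles((2.5, 2.5), 1.0))
print(covered_triangles((0.5, 3.0), 1.0))
```

The first call corresponds to the case $x > S$, $y > S$ and the second to the case $0 < x \le S$ treated in the proof.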

Lemma 6.2 The distribution of $\Delta(\infty)$ is non-degenerate.

Proof. We demonstrate the existence of two configurations that occur with strictly
positive probability and give rise to different values for $\Delta(\infty)$. Note that adding a
point at the origin causes some new edges to be formed (namely those incident to
the origin), and the possible deletion of some edges (namely the edges from points
which have the origin as their directed nearest neighbour after its insertion).

Let $\eta > 0$, with $\eta < 1/3$. Later we shall impose further conditions on $\eta$. Again
we refer to the construction in Figure 3. Let $E_1$ denote the event that for each $i$,
$1 \le i \le 8$, there is a single point of $\mathcal{P}$, denoted $\mathbf{W}_i$, in each of $T_i(\eta)$, and that there
are no other points in $[-1,1]^2$. Suppose that $E_1$ occurs. Then, on addition of the
origin, the only edges that can possibly be removed are those from $\mathbf{W}_1$ and from
$\mathbf{W}_2$ (see the proof of Lemma 6.1). These removed edges have length at most $\eta\sqrt{8}$,
and hence

$$\Delta \ge -2(\eta\sqrt{8})^\alpha := \delta_1, \quad \text{on } E_1. \tag{92}$$

Now let $E_2$ denote the event that there is a single point of $\mathcal{P}$, denoted $\mathbf{Z}_1$, in
the square $(\eta, 2\eta) \times (0, \eta)$, a single point denoted $\mathbf{Z}_2$ in the square $(0, \eta) \times (\eta, 2\eta)$, a
single point denoted $\mathbf{W}$ in the square $(-1-\eta, -1) \times (-\eta, 0)$, and no other point in
$[-3,3]^2$. See Figure 4.

Figure 4: A possible configuration for event $E_2$.

Suppose that $E_2$ occurs. Now, on addition of the origin, an edge of length at
most $1 + 2\eta$ is added from the origin to $\mathbf{W}$. On the other hand, for $i = 1, 2$ the edge
from $\mathbf{Z}_i$ to $\mathbf{W}$ (of length at least 1) is replaced by an edge from $\mathbf{Z}_i$ to the origin (of
length at most $3\eta$). It is also possible that some other edges from points outside
$[-3,3]^2$ are replaced by shorter edges from these points to the origin. Combining
the effect of all these additions and replacements of edges, we find that

$$\Delta \le (1 + 2\eta)^\alpha + 2((3\eta)^\alpha - 1) := \delta_2, \quad \text{on } E_2. \tag{93}$$

Given $\alpha$, by taking $\eta$ small enough we can arrange that $\delta_1 > -1/4$ and $\delta_2 < -3/4$.
With such a choice of $\eta$, events $E_1$ and $E_2$ both have strictly positive probability,
which shows that the distribution of $\Delta$ is non-degenerate. □

For the next lemma, we set $R_0 := (0,1]^2$, recalling that $S_{0,n} := (n^{\varepsilon-1/2}, 1]^2$
throughout this section, and let $\mathcal{R}_0$ be as defined just before Corollary 4.1.

Lemma 6.3 $\mathcal{L}^\alpha$ satisfies the uniform bounded moments condition (63) on $\mathcal{R}_0$.

Proof. Choose some $(A, B) \in \mathcal{R}_0$ such that $\mathbf{0} \in A$, i.e., such that for some $n \in \mathbb{N}$
the set $A$ is a translate of $(0, n^{1/2}]^2$ containing the origin and $B$ is the corresponding
translate of $n^{1/2}S_{0,n} = (n^\varepsilon, n^{1/2}]^2$. Note that $|A| = n$, and choose $m \in [n/2, 3n/2]$.

Denote the $m$ independent random vectors on $A$ comprising $\mathcal{U}_{m,A}$ by $\mathbf{V}_1, \ldots, \mathbf{V}_m$.
For contributions to $\Delta(\mathcal{U}_{m,A}; B)$ we are only interested in edges from points in the
region $B$ away from the boundary of $A$, although the origin can be inserted anywhere
in $A$. Contributions to $\Delta(\mathcal{U}_{m,A}; B)$ come from the edges that are added or deleted
on the addition of $\mathbf{0}$. We split $\Delta(\mathcal{U}_{m,A}; B)$ into two parts: the positive contribution
from added edges, $\Delta^+(\mathcal{U}_{m,A}; B)$, and the negative contribution, $\Delta^-(\mathcal{U}_{m,A}; B)$, from
removed edges.

By construction of the MDSF, the added edges are those that have $\mathbf{0}$ as an
end-point after it has been inserted. Thus an upper bound on $\Delta^+(\mathcal{U}_{m,A}; B)$ is
$L_{\max}^\alpha \delta(\mathbf{0}) + L_0^\alpha$, where $L_{\max}$ is the length of the longest edge from a point of $\mathcal{U}_{m,A} \cap B$
to $\mathbf{0}$, and $\delta(\mathbf{0})$ is the number of such edges (or zero if no such edge exists), and $L_0$
is the length of the edge from $\mathbf{0}$, or zero if no such edge exists.

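For $\mathbf{w} \in A$ and $\mathbf{x} \in B$, with $\mathbf{w} \preccurlyeq^* \mathbf{x}$, define the region

$$R(\mathbf{w}, \mathbf{x}) := \{\mathbf{y} \in A : \mathbf{y} \preccurlyeq^* \mathbf{x},\ \|\mathbf{y} - \mathbf{x}\| \le \|\mathbf{w} - \mathbf{x}\|\}.$$

Since points in $B$ are distant at least 1 from the lower or left boundary of $A$, by
Lemma 5.2 there exists a constant $0 < C < \infty$ such that

$$|R(\mathbf{w}, \mathbf{x})| \ge C\|\mathbf{x} - \mathbf{w}\|, \quad \text{for all } \mathbf{w} \in A,\ \mathbf{x} \in B \text{ with } \mathbf{w} \preccurlyeq^* \mathbf{x} \text{ and } \|\mathbf{x} - \mathbf{w}\| \ge 1. \tag{94}$$

Suppose there is a point at $\mathbf{x}$ with $\mathbf{0} \preccurlyeq^* \mathbf{x}$. Then, the probability of the event $E(\mathbf{x})$
that $\mathbf{x}$ is joined to the origin in the MDSF on $\mathcal{U}_{m,A} \cup \{\mathbf{0}\}$ is

$$P[E(\mathbf{x})] = P[R(\mathbf{0}, \mathbf{x}) \text{ empty}] = \left(1 - \frac{|R(\mathbf{0}, \mathbf{x})|}{|A|}\right)^{m-1} \le \exp\left((1 - m)\frac{|R(\mathbf{0}, \mathbf{x})|}{|A|}\right) \le \exp\left(1 - |R(\mathbf{0}, \mathbf{x})|/2\right), \tag{95}$$

since $m \ge n/2$ and $|R(\mathbf{0}, \mathbf{x})| \le n$.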

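We have that $L_{\max}^\alpha \delta(\mathbf{0}) \le \max_{i=1,\ldots,m} W_i$, where

$$W_i = \|\mathbf{V}_i\|^\alpha\, \mathrm{card}\left(B(\mathbf{0}; \|\mathbf{V}_i\|) \cap \mathcal{U}_{m,A} \cap \{\mathbf{y} : \mathbf{0} \preccurlyeq^* \mathbf{y}\}\right) \mathbf{1}\{\mathbf{V}_i \text{ joined to } \mathbf{0} \text{ and } \mathbf{V}_i \in B\}.$$

Let $N(\mathbf{x})$ denote the number of points of $\mathcal{U}_{m-1,A}$ in $B(\mathbf{0}; \|\mathbf{x}\|) \cap \{\mathbf{y} : \mathbf{0} \preccurlyeq^* \mathbf{y}\}$. Then
we obtain

$$E\left[L_{\max}^{4\alpha} \delta(\mathbf{0})^4\right] \le E \sum_{i=1}^m W_i^4 = m \int_B \frac{d\mathbf{x}}{|A|}\, \|\mathbf{x}\|^{4\alpha}\, E\left[(N(\mathbf{x}) + 1)^4\, \mathbf{1}\{E(\mathbf{x})\}\right].$$

By the Cauchy-Schwarz inequality and the fact that $m \le 3|A|/2$ by assumption,

$$E\left[L_{\max}^{4\alpha} \delta(\mathbf{0})^4\right] \le \frac{3}{2} \int_B \|\mathbf{x}\|^{4\alpha} \left(E[(N(\mathbf{x}) + 1)^8]\right)^{1/2} P[E(\mathbf{x})]^{1/2}\, d\mathbf{x}. \tag{96}$$

The mean of $N(\mathbf{x})$ is bounded by a constant times $\|\mathbf{x}\|^2$ so $E[(N(\mathbf{x}) + 1)^8] =
O(\max(\|\mathbf{x}\|^{16}, 1))$. This follows from the binomial moment generating function for
$\mathrm{Bin}(n, p)$, from which we have for $\beta > 0$ that $E[X^\beta] \le k_1 (E[X])^\beta$ if $pn \ge 1$ and
$E[X^\beta] \le k_2 E[X]$ if $pn < 1$, for some constants $k_1, k_2 > 0$.

Combined with (94) and (95), this shows that $E[L_{\max}^{4\alpha} \delta(\mathbf{0})^4]$ is bounded by
a constant times

$$\int_{\mathbf{x} \in B : \|\mathbf{x}\| \ge 1} \|\mathbf{x}\|^{4\alpha + 8} \exp(-C\|\mathbf{x}\|/4)\, d\mathbf{x} + \int_{\mathbf{x} \in B : \|\mathbf{x}\| < 1} \|\mathbf{x}\|^{4\alpha}\, d\mathbf{x},$$

which is bounded by a constant that does not depend on the choice of $(A, B)$.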

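We need to consider $L_0$ only when $\mathbf{0} \in B$. For $\mathbf{x} \in A$ with $\mathbf{x} \preccurlyeq^* \mathbf{0}$, let $E'(\mathbf{x})$
denote the event that $R(\mathbf{x}, \mathbf{0})$ is empty (i.e., contains no point of $\mathcal{U}_{m-1,A}$). By (94)
and (95), for $\mathbf{0} \in B$ we have

$$E[L_0^{4\alpha}] \le m \int_{\mathbf{x} \in A : \mathbf{x} \preccurlyeq^* \mathbf{0}} \|\mathbf{x}\|^{4\alpha}\, P[E'(\mathbf{x})]\, \frac{d\mathbf{x}}{|A|} \le \frac{3}{2}\left[\int_{\mathbf{x} \in A : \mathbf{x} \preccurlyeq^* \mathbf{0},\, \|\mathbf{x}\| \ge 1} \|\mathbf{x}\|^{4\alpha} \exp(1 - C\|\mathbf{x}\|/2)\, d\mathbf{x} + \int_{\mathbf{x} \in A : \mathbf{x} \preccurlyeq^* \mathbf{0},\, \|\mathbf{x}\| < 1} \|\mathbf{x}\|^{4\alpha}\, d\mathbf{x}\right],$$

which is bounded by a constant. Thus $\Delta^+(\mathcal{U}_{m,A}; B)$ has bounded fourth moment.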

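Now consider the set of deleted edges. As at (79), let $d(\mathbf{x}; \mathcal{U}_{m,A})$ denote the
distance from $\mathbf{x}$ to its directed nearest neighbour in $\mathcal{U}_{m,A}$, or zero if no such point
exists. Again use $E(\mathbf{x})$ for the event that $\mathbf{x}$ becomes joined to $\mathbf{0}$ on the addition of
the origin, and let $E''(\mathbf{V}_i) := E(\mathbf{V}_i) \cap \{\mathbf{V}_i \in B\}$. Then

$$E\left[\Delta^-(\mathcal{U}_{m,A}; B)^4\right] = \sum_{i=1}^m \sum_{j=1}^m \sum_{k=1}^m \sum_{\ell=1}^m E\left[d(\mathbf{V}_i; \mathcal{U}_{m,A})^\alpha\, d(\mathbf{V}_j; \mathcal{U}_{m,A})^\alpha\, d(\mathbf{V}_k; \mathcal{U}_{m,A})^\alpha\, d(\mathbf{V}_\ell; \mathcal{U}_{m,A})^\alpha\, \mathbf{1}\{E''(\mathbf{V}_i) \cap E''(\mathbf{V}_j) \cap E''(\mathbf{V}_k) \cap E''(\mathbf{V}_\ell)\}\right]. \tag{97}$$

For $i, j, k, \ell$ distinct, the $(i,j,k,\ell)$th term of (97) is bounded by

$$\int_B \int_B \int_B \int_B \frac{d\mathbf{w}}{n} \frac{d\mathbf{x}}{n} \frac{d\mathbf{y}}{n} \frac{d\mathbf{z}}{n}\, E\left[d_{m-4}(\mathbf{w})^\alpha\, d_{m-4}(\mathbf{x})^\alpha\, d_{m-4}(\mathbf{y})^\alpha\, d_{m-4}(\mathbf{z})^\alpha\, \mathbf{1}\{E_{m-4}(\mathbf{w}) \cap E_{m-4}(\mathbf{x}) \cap E_{m-4}(\mathbf{y}) \cap E_{m-4}(\mathbf{z})\}\right], \tag{98}$$

where $d_{m-4}(\mathbf{x}) := d(\mathbf{x}; \mathcal{U}_{m-4,A} \cup \{\mathbf{w}, \mathbf{x}, \mathbf{y}, \mathbf{z}\})$ (using the notation of (79)), and
$E_{m-4}(\mathbf{x})$ is the event that $\mathbf{0}$ is the directed nearest neighbour of $\mathbf{x}$ in the set $\mathcal{U}_{m-4,A} \cup
\{\mathbf{0}, \mathbf{x}\}$.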

Let $I_{m-4}(\mathbf{x})$ denote the indicator variable of the event that $\mathbf{x}$ is a minimal
element of $\mathcal{U}_{m-4,A} \cup \{\mathbf{x}\}$. An upper bound for $d_{m-4}(\mathbf{x})$ is provided by $d(\mathbf{x}; \mathcal{U}_{m-4,A} \cup \{\mathbf{x}\})$
except when this is zero, so that

$$d_{m-4}(\mathbf{x})^{8\alpha} \le d(\mathbf{x}; \mathcal{U}_{m-4,A} \cup \{\mathbf{x}\})^{8\alpha} + d(\mathbf{x}; \{\mathbf{w}, \mathbf{x}, \mathbf{y}, \mathbf{z}\})^{8\alpha}\, I_{m-4}(\mathbf{x}). \tag{99}$$

For $\mathbf{x} \in B$, it can be shown, by a similar argument to the one used above for $L_0$,
that there is a constant $C'$ such that

$$E\left[(d(\mathbf{x}; \mathcal{U}_{m-4,A} \cup \{\mathbf{x}\}))^{8\alpha}\right] \le C'. \tag{100}$$

Moreover, if $\mathbf{w} \in A$ with $\mathbf{w} \preccurlyeq^* \mathbf{x}$ and $\|\mathbf{x} - \mathbf{w}\| = t > 0$, then by a similar argument
to that at (95), and by (94), we have that

$$E[I_{m-4}(\mathbf{x})] \le \exp(4 - |R(\mathbf{w}, \mathbf{x})|/2) \le \exp(4 - Ct/2), \quad t \ge 1,$$

and hence, uniformly over $A$, $B$ and $\{\mathbf{w}, \mathbf{x}, \mathbf{y}, \mathbf{z}\} \subset A$ with $\mathbf{x} \in B$, we have

$$E\left[d(\mathbf{x}; \{\mathbf{w}, \mathbf{x}, \mathbf{y}, \mathbf{z}\})^{8\alpha}\, I_{m-4}(\mathbf{x})\right] \le \max\left(\sup_{t \ge 1}\left(t^{8\alpha} \exp(4 - Ct/2)\right),\ 1\right).$$

Combining this with (100), we see from (99) that $E[d_{m-4}(\mathbf{x})^{8\alpha}]$ is bounded by a
constant. Also, by a similar argument to (95) and (94), it can be shown that
$P[E_{m-4}(\mathbf{x})] \le \exp(4 - C\|\mathbf{x}\|/2)$ for $\|\mathbf{x}\| \ge 1$. Therefore, by Hölder's inequality, the
expression (98) is bounded by a constant times

$$n^{-4} \int_B \int_B \int_B \int_B d\mathbf{w}\, d\mathbf{x}\, d\mathbf{y}\, d\mathbf{z}\, \exp\left(-C(\|\mathbf{w}\| + \|\mathbf{x}\| + \|\mathbf{y}\| + \|\mathbf{z}\|)/16\right)$$

and therefore is $O(n^{-4})$. Since the number of distinct $(i,j,k,\ell)$ in the summation
(97) is bounded by $m^4$, and hence by $(3/2)^4 n^4$, this shows that the contribution to
(97) from $i, j, k, \ell$ distinct is uniformly bounded.

Likewise, the number of terms $(i,j,k,\ell)$ with only three distinct values (e.g.,
$i = j$ with $i, k, \ell$ distinct) is $O(n^3)$. Such a term is bounded by an expression like
(98) but now with a triple integral, which by a similar argument is $O(n^{-3})$. Hence
the contribution to (97) of these terms is also bounded. Similarly, the contribution
to (97) from $(i,j,k,\ell)$ with two distinct values has $O(n^2)$ terms which are $O(n^{-2})$,
and so is bounded. Likewise the contribution to (97) from terms with $i = j = k = \ell$
is bounded. Thus the expression (97) is uniformly bounded.

Hence $\Delta(\mathcal{U}_{m,A}; B)$ has bounded fourth moments, uniformly in $A$, $B$, $m$. □



Proof of Theorem 6.1. By Lemmas 6.1, 6.2, 6.3 and the fact that $\mathcal{L}^\alpha$ is homogeneous
of order $\alpha$, we can apply Corollary 4.1, taking $R_0 := (0,1]^2$ and $S_{0,n} :=
(n^{\varepsilon-1/2}, 1]^2$, to obtain Theorem 6.1. □

Remark. An alternative method for proving central limit theorems in geometrical 
probability is based on dependency graphs. Such a method was employed by Avram 
and Bertsimas to give central limit theorems for nearest neighbour graphs and 
other random geometrical structures. A general version of this method is provided 
by ^ni- By a similar argument to one can show that, under the total weight 
(for a > 2/3) of edges in the MDST from points in the region (e„, 1)^ (for e„ given 
below) satisfies a central limit theorem, where 




Such an approach can be suitably adapted to show that a central limit theorem
also holds under the more general partial order specified by $\theta, \phi$, in the region
$(\varepsilon_n, 1 - \varepsilon_n)^2$. The benefit of this method is that it readily yields rates of convergence
bounds for the CLT. The martingale method employed has the advantage of yielding
the convergence of the variance.



7 The edges near the boundary 

Next in our analysis of the MDST on random points in the unit square, we consider 
the length of the edges close to the boundary of the square. The limiting structure 
of the MDSF and MDST near the boundaries is described by the directed linear
forest model discussed in Section 3.

Initially we consider the 'rooted' case where we insert a point at the origin. Later 
we analyse the multiple sink (or 'unrooted') case, where we do not insert a point at 
the origin, in a similar way. 

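Fix $\sigma \in (1/2, 2/3)$. Let $B_n$ denote the L-shaped boundary region $(0,1]^2 \setminus
(n^{-\sigma}, 1]^2$. Recall from (89) that $\mathcal{L}^\alpha(\mathcal{X}; R)$ denotes the contribution to the total
weight of the MDST on $\mathcal{X}$ from edges starting at points of $\mathcal{X} \cap R$. When $\mathcal{X}$ is a
random point set, set $\tilde{\mathcal{L}}^\alpha(\mathcal{X}; R) := \mathcal{L}^\alpha(\mathcal{X}; R) - E\mathcal{L}^\alpha(\mathcal{X}; R)$.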

Theorem 7.1 Suppose the partial order is Then as n —> 00 we have 

£°(pO;i3„)^^i'^ + ^i'^ («>1); (101) 

where Di^\ are independent random variables with the distribution of Dq, 

given by the fixed-point equation |^ for a = 1 and by (0) for a > 1. Also, as 
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n oo, 

C'^iVn, Bn) ^ fP + Fi^^ (a > 1); (103) 
C^{X^-Bn)^Fi^^ + Fi^^ (a>l), (104) 

where F^^^^ , F^^'^ are independent random variables with the same distribution as 
Di for a = 1 and with the distribution given by the fixed-point equation Q) for 
a > 1. Also, as n ^ oo, 

^(a-l)/2^a(p^. B^)JU0 (0 < « < 1); (105) 
^{a-l)/2^a(p0. Bn)-^0 (0 < a < 1). (106) 

The idea behind the proof of Theorem 7.1 is to show that the MDSF near each
of the two boundaries is close to a DLF system defined on a sequence of uniform
random variables coupled to the points of the MDSF. To do this, we produce two
explicit sequences of random variables on which we construct the DLF coupled to
$\mathcal{P}_n$, a Poisson process of intensity $n$ on $(0,1]^2$, on which the MDSF is constructed.

Let $B_n^x$ be the rectangle $(n^{-\sigma}, 1] \times (0, n^{-\sigma}]$, let $B_n^y$ be the rectangle $(0, n^{-\sigma}] \times
(n^{-\sigma}, 1]$, and let $B_n^0$ be the square $(0, n^{-\sigma}]^2$; see Figure 5. Then $B_n = B_n^0 \cup B_n^x \cup B_n^y$.
Define the point processes

$$\mathcal{V}_n^x := \mathcal{P}_n \cap (B_n^x \cup B_n^0), \quad \mathcal{V}_n^y := \mathcal{P}_n \cap (B_n^y \cup B_n^0), \quad \text{and} \quad \mathcal{V}_n^0 := \mathcal{P}_n \cap B_n^0. \tag{107}$$

Figure 5: The boundary regions

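Let $N_n^x := \mathrm{card}(\mathcal{V}_n^x)$, $N_n^y := \mathrm{card}(\mathcal{V}_n^y)$ and $N_n^0 := \mathrm{card}(\mathcal{V}_n^0)$. List $\mathcal{V}_n^x$ in order of
increasing $y$-coordinate as $\mathbf{X}_i^x$, $i = 1, 2, \ldots, N_n^x$. In coordinates, set $\mathbf{X}_i^x = (X_i^x, Y_i^x)$
for each $i$. Similarly, list $\mathcal{V}_n^y$ in order of increasing $x$-coordinate as $\mathbf{X}_i^y = (X_i^y, Y_i^y)$,
$i = 1, \ldots, N_n^y$. Set $\mathcal{U}_n^x = (X_i^x, i = 1, 2, \ldots, N_n^x)$ and $\mathcal{U}_n^y = (Y_i^y, i = 1, 2, \ldots, N_n^y)$.
Then $\mathcal{U}_n^x$ and $\mathcal{U}_n^y$ are sequences of uniform random variables in $(0,1]$, on which we
may construct a DLF. Also, we write $\mathcal{U}_n^{x,0}$ for the sequence $(0, X_1^x, X_2^x, \ldots, X_{N_n^x}^x)$,
and $\mathcal{U}_n^{y,0}$ for the sequence $(0, Y_1^y, \ldots, Y_{N_n^y}^y)$.

With the total DLF/DLT weight functional $D^\alpha(\cdot)$ defined in Section 3 for random
finite sequences in $(0,1)$, the DLF weight $D^\alpha(\mathcal{U}_n^x)$ is coupled in a natural way to
the MDSF contribution $\mathcal{L}^\alpha(\mathcal{V}_n^x)$, and likewise for $D^\alpha(\mathcal{U}_n^y)$ and $\mathcal{L}^\alpha(\mathcal{V}_n^y)$, for $D^\alpha(\mathcal{U}_n^{x,0})$
and $\mathcal{L}^\alpha(\mathcal{V}_n^x \cup \{\mathbf{0}\})$, and for $D^\alpha(\mathcal{U}_n^{y,0})$ and $\mathcal{L}^\alpha(\mathcal{V}_n^y \cup \{\mathbf{0}\})$.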

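Lemma 7.1 For any $\alpha \ge 1$, as $n \to \infty$,

$$\mathcal{L}^\alpha(\mathcal{V}_n^x) - D^\alpha(\mathcal{U}_n^x) \xrightarrow{L^2} 0, \quad \text{and} \quad \mathcal{L}^\alpha(\mathcal{V}_n^y) - D^\alpha(\mathcal{U}_n^y) \xrightarrow{L^2} 0; \tag{108}$$

$$\mathcal{L}^\alpha(\mathcal{V}_n^x \cup \{\mathbf{0}\}) - D^\alpha(\mathcal{U}_n^{x,0}) \xrightarrow{L^2} 0, \quad \text{and} \quad \mathcal{L}^\alpha(\mathcal{V}_n^y \cup \{\mathbf{0}\}) - D^\alpha(\mathcal{U}_n^{y,0}) \xrightarrow{L^2} 0. \tag{109}$$

Further, for $0 < \alpha < 1$, as $n \to \infty$,

$$E\left[\left(\mathcal{L}^\alpha(\mathcal{V}_n^x) - D^\alpha(\mathcal{U}_n^x)\right)^2\right] = O\left(n^{2 - 2\sigma - 2\alpha\sigma}\right), \tag{110}$$

and the corresponding result holds for $\mathcal{V}_n^y$ and $\mathcal{U}_n^y$, and for the rooted cases (with the
addition of the origin).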

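Proof. We approximate the MDSF in the region $B_n$ by two DLFs, coupled to the
MDSF. Consider $\mathcal{V}_n^x$; the argument for $\mathcal{V}_n^y$ is entirely analogous.

We have the set of points $\mathcal{V}_n^x = \{(X_i^x, Y_i^x), i = 1, \ldots, N_n^x\}$. We construct the
MDSF on these points, and construct the DLF on the $x$-coordinates, $\mathcal{U}_n^x = (X_i^x, i =
1, \ldots, N_n^x)$. Consider any point $(X_i^x, Y_i^x)$. For any single point, either an edge exists
from that point in both constructions, or in neither. Suppose an edge exists, that
is suppose $X_i^x$ is joined to a point $X_{D(i)}^x$, $D(i) < i$, in the DLF model, and $(X_i^x, Y_i^x)$
to a point $(X_{N(i)}^x, Y_{N(i)}^x)$ in the MDST (we do not necessarily have $N(i) = D(i)$).
By construction, we know that $|X_i^x - X_{D(i)}^x| \le |X_i^x - X_{N(i)}^x|$, since $N(i) < i$ by the
order of our points. It then follows that

$$\|(X_i^x, Y_i^x) - (X_{N(i)}^x, Y_{N(i)}^x)\|^\alpha \ge |X_i^x - X_{N(i)}^x|^\alpha \ge |X_i^x - X_{D(i)}^x|^\alpha,$$

and so we have established that, for all $\alpha > 0$,

$$D^\alpha(\mathcal{U}_n^x) \le \mathcal{L}^\alpha(\mathcal{V}_n^x); \quad \text{and} \quad D^\alpha(\mathcal{U}_n^{x,0}) \le \mathcal{L}^\alpha(\mathcal{V}_n^x \cup \{\mathbf{0}\}).$$

Now, by the construction of the MDST, we have that

$$\|(X_i^x, Y_i^x) - (X_{N(i)}^x, Y_{N(i)}^x)\| \le \|(X_i^x, Y_i^x) - (X_{D(i)}^x, Y_{D(i)}^x)\|. \tag{111}$$

If $(x, y) \in (0,1]^2$ then $\|(x,y)\| \le x + y$, and by the Mean Value Theorem for the
function $t \mapsto t^\alpha$, for $\alpha \ge 1$,

$$\|(x,y)\|^\alpha - x^\alpha \le (x + y)^\alpha - x^\alpha \le \alpha 2^{\alpha-1} y \quad (\alpha \ge 1).$$

Hence, for $\alpha \ge 1$,

$$\|(X_i^x, Y_i^x) - (X_{D(i)}^x, Y_{D(i)}^x)\|^\alpha - (X_i^x - X_{D(i)}^x)^\alpha \le \alpha 2^{\alpha-1}(Y_i^x - Y_{D(i)}^x). \tag{112}$$

Then (111) and (112) yield, for $\alpha \ge 1$,

$$\|(X_i^x, Y_i^x) - (X_{N(i)}^x, Y_{N(i)}^x)\|^\alpha - (X_i^x - X_{D(i)}^x)^\alpha \le \alpha 2^{\alpha-1}(Y_i^x - Y_{D(i)}^x).$$

Hence, for $\alpha \ge 1$,

$$0 \le \mathcal{L}^\alpha(\mathcal{V}_n^x) - D^\alpha(\mathcal{U}_n^x) \le \alpha 2^{\alpha-1} \sum_{i=1}^{N_n^x} (Y_i^x - Y_{D(i)}^x).$$

Thus, for $\alpha \ge 1$,

$$0 \le \mathcal{L}^\alpha(\mathcal{V}_n^x) - D^\alpha(\mathcal{U}_n^x) \le \alpha 2^{\alpha-1} N_n^x n^{-\sigma};$$
$$\text{and} \quad 0 \le \mathcal{L}^\alpha(\mathcal{V}_n^x \cup \{\mathbf{0}\}) - D^\alpha(\mathcal{U}_n^{x,0}) \le \alpha 2^{\alpha-1} N_n^x n^{-\sigma}. \tag{113}$$

We have $N_n^x \sim \mathrm{Po}(n^{1-\sigma})$, so that since $\sigma > 1/2$, we have

$$E\left[\left(\mathcal{L}^\alpha(\mathcal{V}_n^x \cup \{\mathbf{0}\}) - D^\alpha(\mathcal{U}_n^{x,0})\right)^2\right] \le \alpha^2 2^{2\alpha-2} n^{-2\sigma} E[(N_n^x)^2] \to 0.$$

An entirely analogous argument leads to the same statement for $\mathcal{U}_n^y$ and $\mathcal{V}_n^y$, and we
obtain (108), and (109) in identical fashion.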
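The sandwich bound (113) can be illustrated on a simulated strip. The sketch below assumes the DLF rule implied by the argument above (each entry is joined to the nearest earlier entry lying to its left, if any); the strip height, sample size and function names are hypothetical choices for the example, with `height` playing the role of $n^{-\sigma}$.

```python
import math
import random

def mdsf_weight(points, alpha):
    """Total alpha-weighted length of directed nearest-neighbour edges
    under the coordinatewise order."""
    total = 0.0
    for i, p in enumerate(points):
        d = [math.dist(p, q) for j, q in enumerate(points)
             if j != i and q[0] <= p[0] and q[1] <= p[1]]
        if d:
            total += min(d) ** alpha
    return total

def dlf_weight(xs, alpha):
    """Assumed DLF rule: each entry is joined to the nearest earlier entry to its left."""
    total = 0.0
    for i, x in enumerate(xs):
        gaps = [x - u for u in xs[:i] if u < x]
        if gaps:
            total += min(gaps) ** alpha
    return total

def strip_sample(count, height, seed=0):
    """`count` uniform points in (0,1] x (0,height], listed by increasing y-coordinate."""
    rng = random.Random(seed)
    pts = [(rng.random(), height * rng.random()) for _ in range(count)]
    return sorted(pts, key=lambda p: p[1])

pts = strip_sample(200, 0.05)
xs = [p[0] for p in pts]
L, D = mdsf_weight(pts, 1.5), dlf_weight(xs, 1.5)
print(D, L, 1.5 * 2 ** 0.5 * len(pts) * 0.05)
```

The DLF weight should not exceed the MDSF weight, and for $\alpha \ge 1$ their difference should be at most $\alpha 2^{\alpha-1}$ times the number of points times the strip height.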

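We now consider $0 < \alpha < 1$. By the concavity of the function $t \mapsto t^\alpha$ for $\alpha < 1$,
we have for $x > 0$, $y > 0$ that

$$\|(x,y)\|^\alpha - x^\alpha \le (x + y)^\alpha - x^\alpha \le y^\alpha \quad (0 < \alpha < 1).$$

Then, by a similar argument to (113) in the $\alpha \ge 1$ case, we obtain

$$0 \le \mathcal{L}^\alpha(\mathcal{V}_n^x) - D^\alpha(\mathcal{U}_n^x) \le N_n^x n^{-\alpha\sigma}.$$

Then (110) follows since $N_n^x \sim \mathrm{Po}(n^{1-\sigma})$, and the rooted case is similar. □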

Lemma 7.2 Suppose Di has distribution given by Da, a > 1, has distribution 
given by and Fa, a > 1, has distribution given by Then as n ^ oo, 

C\V:U{0})^D,, and C^V^) ^ D^; (114) 

£"(V^ U {0}) ^ and £"(V^)^F„ (a > 1). (115) 

Moreover, ill4\ ) o-nd \115\) also hold with replaced by Vn- 

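Proof. As usual we present the argument for $\mathcal{V}_n^x$ only, since the result for $\mathcal{V}_n^y$ follows
in the same manner. First consider the $\alpha > 1$ case. We have the distributional
equality

$$\mathcal{L}\left(D^\alpha(\mathcal{U}_n^{x,0}) \mid N_n^x = m\right) = \mathcal{L}\left(D^\alpha(\mathcal{U}_m^0)\right); \quad \mathcal{L}\left(D^\alpha(\mathcal{U}_n^x) \mid N_n^x = m\right) = \mathcal{L}\left(D^\alpha(\mathcal{U}_m)\right).$$

But $N_n^x$ is Poisson with mean $n^{1-\sigma}$ and so tends to infinity almost surely. Thus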

by Theorem ED (ii), D'^iUn'^) ^ and D''{U^) as n ^ oo, and so by 

Lemma l7. II and Slutsky's theorem, we obtain 

£"(V^ U {0}) ^ and £"(V^) ^ F„ as n ^ oo. (116) 

Also, ElD'^iUn'^)] (a - 1)"^ by ((2H), so by Lemma O and Proposition E31 
S[£"(V^ U {0})] ^ (a - l)-i = £;[Z)a]- Similarly, by (jSHJ, Lemma lO and Propo- 
sition El ^[£"(V^)] ^ (a(a - 1))-^ = E[Fa]. Hence, (tTTTIll stiU holds with the 
centred variables, i.e., (|115() holds. 

Now suppose a = 1. Since is Poisson with parameter n^~'^, Lemma 13.71 (i). 

with t = n^-", then shows that D^{Un' ) — > Di as n ^ oo. Slutsky's theorem 

with Lemma l7. II then implies that £^(Vi^U{0}) Di. In the same way we obtain 

CMy^) ^ Di, this time using part (ii) instead of part (i) of Lemma 13.71 along 
with Proposition 13.71 □ 

Note that D°^{l/(^) and D'^iUn) are not independent. To deal with this, we define 
V- := Vn n and Vl := Vn n B^. 

Also, recaU the definition of V° at (fTUTl) . Let := card(V^) and NK := card(V^). 
Since and Bn are disjoint, £"(V^^') and C^iVn) are independent, by the spatial 
independence property of the Poisson process Vn- 

Lemma 7.3 Suppose a > 0. Then: 

(i) As n ^ oo, 

- £"(V:) ^ 0, and L^iyl) - C^iyl) ^ 0; (117) 

a^iyi u {0}) - £-(K u {0}) ^ o, and a^iyi u {o}) - L-iyi u {o}) ^ o.(ii8) 

(ii) Asn-^oo, we have £"(V°) ^ 0, and £"(V° U {0}) 0. 

Proof. We first prove (i). We give only the argument for V^; that for Vn is 
analogous. Set A := /:"(V^) - £"(!>„)■ Let [3 = [a + (l/2))/2. Then 1/2 < /3 < a. 

Assume without loss of generality that Vn is the restriction to (0, 1]^ of a ho- 
mogeneous Poisson process Hn of intensity n on R^. Let X~ = {X~ ,Y~) be the 
point of Tin H ((0,n~^] x (0,oo)) with minimal y-coordinate. Then X~ is uniform 
on (0,n~^]. Let En be the event that X~ > 3n~'^; then = 3n^~'^ for n large 

enough. 

Let Ai be the the contribution to A from edges starting at points in (0, n"^] x 
(O,?!"*^]. Then the absolute value of Ai is bounded by the product of {^/2n~^)°' 
and the number of points of Vn in (0, n~^] x (0, n~°^]. Hence, for any a > 0, 

E[|Ai|] < (V2n-^)°^ card(^P„n((0,n-^] X (Cn--^])) 
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Let A2 := A — Ai, the contribution to A from edges starting at points in 
{n~^, 1] X (0,n~'^]. Then by the triangle inequality, if En occurs then these edges 
are unaffected by points in B^, so that A2 is zero if En occurs. Also, only minimal 
elements of 7^„n (n~^, 1] x (0, n~'^] can possibly have their directed nearest neighbour 
in (0, n~'^] X (0,n~'^]; hence, if M„ denotes the number of such minimal elements 
then IA2I is bounded by 2°/^M„. Hence, using (|5H|) . we obtain 

£;[|A2|] < 2"/2p[^^]£;[M„] = 0(n^-'^ log ?i) 

which tends to zero. Combined with (|119|) . this gives us (|117j) . The same argument 
gives us ()118|) . 

For (ii), note that 

E [/:"(V°)] < (^n~")"E[iV°] = 2°/2n^-2— ™ ^0, as n ^ 00, 
for any a > 0. Thus £"(V°) 0, and similarly £"(V° U {0}) 0. □ 

In proving our next lemma (and again later on) we use the following elementary fact. If $N(n)$ is Poisson with parameter $n$, then as $n \to \infty$ we have

$$E[\,|N(n) - n| \log \max(N(n), n)\,] = O(n^{1/2} \log n). \tag{120}$$

To see this, set $Y_n := |N(n) - n| \log \max(N(n), n)$. Then $Y_n \mathbf{1}\{N(n) \le 2n\} \le |N(n) - n| \log(2n)$, and the expectation of this is $O(n^{1/2} \log n)$ by Jensen's inequality since $\mathrm{Var}(N(n)) = n$. On the other hand, the Cauchy-Schwarz inequality shows that $E[Y_n \mathbf{1}\{N(n) > 2n\}] \to 0$, and (120) follows.
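The bound (120) can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the expectation by summing the Poisson probability mass function directly, and the function names and the truncation point of the sum are choices made here for illustration. If (120) holds, the printed ratio should stay bounded as $n$ grows.

```python
import math


def poisson_abs_dev_log(n: int) -> float:
    """Approximate E[|N - n| * log(max(N, n))] for N ~ Poisson(n).

    The sum is truncated far in the upper tail (an assumption of this
    sketch), where the remaining probability mass is negligible.
    """
    upper = n + int(40 * math.sqrt(n)) + 50
    log_n = math.log(n)
    total = 0.0
    for k in range(upper + 1):
        # log of the Poisson(n) probability mass at k, computed stably
        log_pmf = k * log_n - n - math.lgamma(k + 1)
        total += math.exp(log_pmf) * abs(k - n) * math.log(max(k, n))
    return total


def ratio(n: int) -> float:
    """Left-hand side of (120) divided by n^{1/2} log n."""
    return poisson_abs_dev_log(n) / (math.sqrt(n) * math.log(n))


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(ratio(n), 3))
```

The ratio settling near a constant is consistent with the order $n^{1/2}\log n$ in (120), since $E|N(n) - n|$ is of order $n^{1/2}$.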

We now state a lemma for coupling $\mathcal{X}_n$ and $\mathcal{P}_n$. The $\alpha \ge 1$ part will be used in the proof of Theorem 7.1. The $0 < \alpha < 1$ part will be needed later, in the proof of Theorem 2.2. As in Section 2, let $S_{0,n}$ denote the 'inner' region $(n^{\varepsilon-1/2}, 1]^2$, with $\varepsilon \in (0, 1/2)$ a constant. The boundary region $B_n$ is disjoint from $S_{0,n}$; let $C_n$ denote the intermediate region $(0,1]^2 \setminus (B_n \cup S_{0,n})$, so that $B_n \cup C_n = (0,1]^2 \setminus S_{0,n}$.



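Lemma 7.4 There exists a coupling of $\mathcal{X}_n$ and $\mathcal{P}_n$ such that:

(i) For $0 < \alpha < 1$, provided $\varepsilon < (1-\alpha)/2$, we have that as $n \to \infty$,

$$n^{(\alpha-1)/2} E[\,|\mathcal{L}^\alpha(\mathcal{X}_n; B_n \cup C_n) - \mathcal{L}^\alpha(\mathcal{P}_n; B_n \cup C_n)|\,] \to 0 \tag{121}$$

and

$$n^{(\alpha-1)/2} E[\,|\mathcal{L}^\alpha(\mathcal{X}_n^0; B_n \cup C_n) - \mathcal{L}^\alpha(\mathcal{P}_n^0; B_n \cup C_n)|\,] \to 0. \tag{122}$$

(ii) For $\alpha \ge 1$, we have that as $n \to \infty$,

$$E[\,|\mathcal{L}^\alpha(\mathcal{X}_n; B_n) - \mathcal{L}^\alpha(\mathcal{P}_n; B_n)|\,] \to 0 \tag{123}$$

and

$$E[\,|\mathcal{L}^\alpha(\mathcal{X}_n^0; B_n) - \mathcal{L}^\alpha(\mathcal{P}_n^0; B_n)|\,] \to 0. \tag{124}$$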



Proof. We couple $\mathcal{X}_n$ and $\mathcal{P}_n$ in the following standard way. Let $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3, \ldots$ be independent uniform random vectors on $(0,1]^2$, and let $N(n) \sim \mathrm{Po}(n)$ be independent of $(\mathbf{X}_1, \mathbf{X}_2, \ldots)$. For $m \in \mathbb{N}$ (and in particular for $m = n$) set $\mathcal{X}_m := \{\mathbf{X}_1, \ldots, \mathbf{X}_m\}$; set $\mathcal{P}_n := \{\mathbf{X}_1, \ldots, \mathbf{X}_{N(n)}\}$.

For each $m \in \mathbb{N}$, let $Y_m$ denote the in-degree of vertex $\mathbf{X}_m$ in the MDST on $\mathcal{X}_m$. Suppose $\mathbf{X}_m = \mathbf{x}$. Then an upper bound for $Y_m$ is provided by the number of minimal elements of the restriction of $\mathcal{X}_{m-1}$ to the rectangle $\{\mathbf{y} \in (0,1]^2 : \mathbf{x} \preccurlyeq^* \mathbf{y}\}$. Hence, conditional on $\mathbf{X}_m = \mathbf{x}$ and on there being $k$ points of $\mathcal{X}_{m-1}$ in this rectangle, the expected value of $Y_m$ is bounded by the expected number of minimal elements in a random uniform sample of $k$ points in this rectangle, and hence (see (86)) by $1 + \log k$. Hence, given the value of $\mathbf{X}_m$, the conditional expectation of $Y_m$ is bounded by $1 + \log m$.
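The coupling and the minimal-element count used above can be sketched in code. This is an illustrative sketch only: the helper names `poisson_count`, `coupled_samples`, `minimal_elements` and `harmonic` are hypothetical, and the Poisson sampler (counting unit-rate exponential arrivals) is one convenient choice, not something prescribed by the text. The last helper relies on the standard fact that $k$ independent uniform points in a rectangle have, on average, $H_k = 1 + 1/2 + \cdots + 1/k$ minimal elements, which is at most $1 + \log k$.

```python
import math
import random


def poisson_count(mean, rng):
    """Sample N ~ Po(mean) by counting unit-rate exponential arrivals in [0, mean)."""
    count, clock = 0, rng.expovariate(1.0)
    while clock < mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def coupled_samples(n, rng):
    """Return (X_n, P_n): the first n and the first N(n) points of one i.i.d. sequence."""
    size = poisson_count(n, rng)
    # random() is uniform on [0, 1); the difference from (0, 1] has probability zero.
    points = [(rng.random(), rng.random()) for _ in range(max(n, size))]
    return points[:n], points[:size]


def minimal_elements(points):
    """Minimal elements of the coordinatewise partial order on a planar point set."""
    minima = []
    lowest_y = math.inf
    # Sweep in increasing x; a point is minimal exactly when its y-coordinate
    # lies strictly below that of every point already seen.
    for x, y in sorted(points):
        if y < lowest_y:
            minima.append((x, y))
            lowest_y = y
    return minima


def harmonic(k):
    """Harmonic number H_k, the mean number of minimal elements of k uniform points."""
    return sum(1.0 / j for j in range(1, k + 1))


if __name__ == "__main__":
    rng = random.Random(1)
    binomial, poisson = coupled_samples(1000, rng)
    print(len(binomial), len(poisson), len(minimal_elements(binomial)))
```

Because both samples are prefixes of the same sequence, they differ in exactly $|N(n) - n|$ points, which is what makes the telescoping bounds in the proof possible.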

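First we prove the statements in part (i) ($0 < \alpha < 1$). Suppose $\varepsilon < (1-\alpha)/2$. Then

$$|\mathcal{L}^\alpha(\mathcal{X}_m; B_n \cup C_n) - \mathcal{L}^\alpha(\mathcal{X}_{m-1}; B_n \cup C_n)| \le 2^{\alpha/2} (Y_m + 1) \mathbf{1}\{\mathbf{X}_m \in B_n \cup C_n\}. \tag{125}$$

Since $B_n \cup C_n$ has area $2n^{\varepsilon-1/2} - n^{2\varepsilon-1}$, we obtain

$$E[(Y_m + 1) \mathbf{1}\{\mathbf{X}_m \in B_n \cup C_n\}] \le (2 + \log m) \, 2n^{\varepsilon-1/2}.$$

Hence, by (125) there is a constant $C$ such that

$$n^{(\alpha-1)/2} E[\,|\mathcal{L}^\alpha(\mathcal{P}_n; B_n \cup C_n) - \mathcal{L}^\alpha(\mathcal{X}_n; B_n \cup C_n)| \mid N(n)\,] \le C |N(n) - n| \log(\max(N(n), n)) \, n^{(\alpha+2\varepsilon-2)/2},$$

and since we assume $\alpha + 2\varepsilon < 1$, by (120) the expected value of the right-hand side tends to zero as $n \to \infty$, and we obtain (121). Likewise in the rooted case (122).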
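Now we prove part (ii). For $\alpha \ge 1$, we have

$$|\mathcal{L}^\alpha(\mathcal{X}_m; B_n) - \mathcal{L}^\alpha(\mathcal{X}_{m-1}; B_n)| \le 2^{\alpha/2} (Y_m + 1) \mathbf{1}\{\mathbf{X}_m \in B_n\}. \tag{126}$$

Since $B_n$ has area $2n^{-\sigma} - n^{-2\sigma}$, by (126) there is a constant $C$ such that

$$E[\,|\mathcal{L}^\alpha(\mathcal{P}_n; B_n) - \mathcal{L}^\alpha(\mathcal{X}_n; B_n)| \mid N(n)\,] \le C |N(n) - n| \log(\max(N(n), n)) \, n^{-\sigma},$$

and since $\sigma > 1/2$, by (120) the expected value of the right-hand side tends to zero as $n \to \infty$, and we obtain (123). We get (124) similarly. □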

Proof of Theorem 7.1. Suppose $\alpha \ge 1$. We have that

&{V^J = + (£"(K) - £"(V:)). 

The final bracket converges to zero in probability, by Lemma 7.3 (i). Thus by Lemma 7.2 and Slutsky's theorem, we obtain $\tilde{\mathcal{L}}^\alpha(\mathcal{V}_n^x) \xrightarrow{d} \tilde{F}_\alpha$ (where we have $\tilde{F}_1 = \tilde{D}_1$).
Now

£'^(v^) + c^m = c^m + c%vi) + {c^m - £"(v:)) + {C^VD - C^VD). 
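The last two brackets converge to zero in probability, by Lemma 7.3 (i). Then the independence of $\mathcal{L}^\alpha(\mathcal{V}_n^x)$ and $\mathcal{L}^\alpha(\mathcal{V}_n^y)$ and another application of Slutsky's theorem yield

$$\tilde{\mathcal{L}}^\alpha(\mathcal{V}_n^x) + \tilde{\mathcal{L}}^\alpha(\mathcal{V}_n^y) \xrightarrow{d} \tilde{F}_\alpha^{\{1\}} + \tilde{F}_\alpha^{\{2\}},$$

where $\tilde{F}_\alpha^{\{1\}}$ and $\tilde{F}_\alpha^{\{2\}}$ are independent copies of $\tilde{F}_\alpha$. Similarly,

$$\tilde{\mathcal{L}}^\alpha(\mathcal{V}_n^x \cup \{0\}) + \tilde{\mathcal{L}}^\alpha(\mathcal{V}_n^y \cup \{0\}) \xrightarrow{d} \tilde{D}_\alpha^{\{1\}} + \tilde{D}_\alpha^{\{2\}}.$$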

Finally, since = £°(V^) + C°'{Vii) - >C"(V°) (with a similar statement 

including the origin) Lemma 17.31 (ii) and Slutsky's theorem complete the proof of 
(HnH and (fimi) . 

To deduce (102) and (104), assume without loss of generality that $\mathcal{X}_n$ and $\mathcal{P}_n$ are coupled in the manner of Lemma 7.4. Then $\mathcal{L}^\alpha(\mathcal{P}_n; B_n) - \mathcal{L}^\alpha(\mathcal{X}_n; B_n)$ tends to zero in probability by (123), and $\mathcal{L}^\alpha(\mathcal{P}_n^0; B_n) - \mathcal{L}^\alpha(\mathcal{X}_n^0; B_n)$ tends to zero in probability by (124). Hence by Slutsky's theorem, the convergence results (101) and (103) carry through to the binomial point process case, i.e., (102) and (104) hold.

Now suppose < a < 1. Then gives us 



E 



n 



{q-1)/2 ( ra(-^)X 



{C^{Vl)-D^{K)) 



which tends to as n ^ oo, since a > 1/2. Likewise for the rooted case, 



E 



^{a-l)/2 ^^a^^x ^ _ D"(U^'^)) f =0 



(127) 



(128) 



By Proposition 13.21 we have 

_E[n(°-i)/2D"(W;^)] = 0(n("-i^/2^[(iV;?)^-"]) = 0(?i("-i)('^-i/2)) ^ q, 
and combined with p27j) this completes the proof of (|l()5j) . Similarly, by Proposition 

EH 

^[j^{a-l)/2^a(^x,0)] ^ 0(n(°-l)/2^[(iV^)l-"]) = 0(?l("-l)('^-l/2) ) ^ Q, 

and combined with (|128|) this gives us (|106j) . □ 



8 Proof of Theorem 2.2

Let $\sigma \in (1/2, 2/3)$. Let $\varepsilon > 0$ with

$$\varepsilon < \min\big(1/2,\ (1-\sigma)/3,\ (3-4\sigma)/10,\ (2-3\sigma)/8\big). \tag{129}$$

In addition, if $0 < \alpha < 1$, we impose the further condition that $\varepsilon < (1-\alpha)/2$. As before, denote by $S_{0,n}$ the region $(n^{\varepsilon-1/2}, 1]^2$, let $B_n$ denote the region $(0,1]^2 \setminus (n^{-\sigma}, 1]^2$, and let $C_n$ denote $(0,1]^2 \setminus (B_n \cup S_{0,n})$.

We know from Sections 6 and 7 that, for large $n$, the weight of edges starting in $S_{0,n}$ satisfies a central limit theorem, and the weight of edges starting in $B_n$ can be approximated by the directed linear forest. We shall show in Lemmas 8.2 and 8.3 that (with a suitable scaling factor for $\alpha < 1$) the contribution to the total weight from points in $C_n$ has variance converging to zero. To complete the proof of Theorem 2.2 in the Poisson case, we shall show that the lengths from $B_n$ and $S_{0,n}$ are asymptotically independent by virtue of the fact that the configuration of points in $C_n$ is (with probability approaching one) sufficient to ensure that the configuration of points in $B_n$ has no effect on the edges from points in $S_{0,n}$. To extend the result to the binomial point process case, we shall use a de-Poissonization argument related to that used in [17].

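First consider the region $C_n$. We naturally divide this into three regions. Let

$$C_n^x := (n^{\varepsilon-1/2}, 1] \times (n^{-\sigma}, n^{\varepsilon-1/2}], \quad C_n^y := (n^{-\sigma}, n^{\varepsilon-1/2}] \times (n^{\varepsilon-1/2}, 1], \quad C_n^0 := (n^{-\sigma}, n^{\varepsilon-1/2}]^2.$$

Also, as before, let

$$B_n^x := (n^{-\sigma}, 1] \times (0, n^{-\sigma}], \quad B_n^y := (0, n^{-\sigma}] \times (n^{-\sigma}, 1], \quad B_n^0 := (0, n^{-\sigma}]^2.$$

We divide $C_n$ and $B_n$ into rectangular cells as follows (see Figure 6). We leave $C_n^0$ undivided. We set

$$k_n := \lfloor n^{1-\sigma-2\varepsilon} \rfloor \tag{130}$$

and divide $C_n^x$ lengthways into $k_n$ cells. For each cell,

$$\text{width} = (1 - n^{\varepsilon-1/2})/k_n \sim n^{2\varepsilon+\sigma-1}; \qquad \text{height} = n^{\varepsilon-1/2} - n^{-\sigma} \sim n^{\varepsilon-1/2}. \tag{131}$$

Label these cells $\Gamma_i^x$ for $i = 1, 2, \ldots, k_n$ from left to right. For each cell $\Gamma_i^x$, define the adjoining cell of $B_n^x$, formed by extending the vertical edges of $\Gamma_i^x$, to be $\beta_i^x$. The cells $\beta_i^x$ then have width $(1 - n^{\varepsilon-1/2})/k_n \sim n^{2\varepsilon+\sigma-1}$ and height $n^{-\sigma}$.

In a similar way we divide $C_n^y$ into $k_n$ cells $\Gamma_i^y$ of height $(1 - n^{\varepsilon-1/2})/k_n$ and width $n^{\varepsilon-1/2} - n^{-\sigma}$, and divide $B_n^y$ into the corresponding cells $\beta_i^y$, $i = 1, \ldots, k_n$.

For $i = 2, \ldots, k_n$, let $E_{x,i}$ denote the event that the cell $\beta_{i-1}^x$ contains at least one point of $\mathcal{P}_n$, and let $E_{y,i}$ denote the event that $\beta_{i-1}^y$ contains at least one point of $\mathcal{P}_n$.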
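The cell dimensions in (130) and (131) can be tabulated for concrete parameter values. The sketch below is illustrative only; the function name `cell_geometry` and the sample values of $n$, $\sigma$ and $\varepsilon$ are assumptions made here, chosen so that (129) holds. It reports the number of cells, their common width, and the mean number of Poisson points in a cell $\beta_i^x$ and in a cell $\Gamma_i^x$, which should be close to $n^{2\varepsilon}$ and $n^{3\varepsilon+\sigma-1/2}$ respectively for large $n$.

```python
import math


def cell_geometry(n, sigma, eps):
    """Cell counts and sizes from (130) and (131), with mean point counts."""
    k_n = math.floor(n ** (1 - sigma - 2 * eps))      # number of cells, (130)
    width = (1 - n ** (eps - 0.5)) / k_n              # common width of Gamma and beta cells
    gamma_height = n ** (eps - 0.5) - n ** (-sigma)   # height of a cell of C_n^x
    beta_height = n ** (-sigma)                       # height of a cell of B_n^x
    return {
        "k_n": k_n,
        "width": width,
        "gamma_height": gamma_height,
        "beta_height": beta_height,
        # A Poisson process of intensity n puts mean n * area points in a cell.
        "beta_mean": n * width * beta_height,
        "gamma_mean": n * width * gamma_height,
    }


if __name__ == "__main__":
    n, sigma, eps = 10 ** 6, 0.6, 0.02   # sample values satisfying (129)
    geometry = cell_geometry(n, sigma, eps)
    print(geometry["k_n"], geometry["beta_mean"], n ** (2 * eps))
```

For moderate $n$ the mean $n^{2\varepsilon}$ is still small, since $\varepsilon$ must be small; the events $E_{x,i}$ only become overwhelmingly likely for very large $n$.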

Figure 6: The regions of $[0,1]^2$.

Lemma 8.1 For $n$ sufficiently large, and for $1 \le j < i \le k_n$ with $i - j > 3$, if $E_{x,i}$ (respectively $E_{y,i}$) occurs then no point in the cell $\Gamma_i^x$ (respectively $\Gamma_i^y$) has a directed nearest neighbour in the cell $\Gamma_j^x$ or $\beta_j^x$ ($\Gamma_j^y$ or $\beta_j^y$).

Proof. Consider a point $X$, say, in cell $\Gamma_i^x$ in $C_n^x$. Given $E_{x,i}$, we know that there is a point, $Y$ say, in the cell $\beta_{i-1}^x$ to the left of the $\beta_i^x$ cell immediately below $\Gamma_i^x$, such that $Y \preccurlyeq^* X$, but the difference in $x$-coordinates between $X$ and $Y$ is no more than twice the width of a cell. So, by the triangle inequality, we have

$$\|X - Y\| \le 2(1 - n^{\varepsilon-1/2})/k_n + n^{\varepsilon-1/2} \sim 2n^{2\varepsilon+\sigma-1}, \tag{132}$$

since $\sigma > 1/2$. Now, consider a point $Z$ in a cell $\Gamma_j^x$ or $\beta_j^x$ with $j \le i - 4$. In this case, the difference in $x$-coordinates between $X$ and $Z$ is at least the width of 3 cells, so that

$$\|X - Z\| \ge 3(1 - n^{\varepsilon-1/2})/k_n \sim 3n^{2\varepsilon+\sigma-1}. \tag{133}$$

Comparing (132) and (133), we see that $X$ is not connected to $Z$, which completes the proof. □

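Recall that for a point set $S \subset \mathbb{R}^2$ and a region $R \subseteq \mathbb{R}^2$, $\mathcal{L}^\alpha(S; R)$ denotes the total weight of edges of the MDSF on $S$ which originate in the region $R$.

Lemma 8.2 As $n \to \infty$, we have that

$$\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n; C_n)] \to 0 \ \text{ and } \ \mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n^0; C_n)] \to 0 \quad (\alpha \ge 1); \tag{134}$$

$$\mathrm{Var}[n^{(\alpha-1)/2} \mathcal{L}^\alpha(\mathcal{P}_n; C_n)] \to 0 \quad (0 < \alpha < 1); \tag{135}$$

$$\mathrm{Var}[n^{(\alpha-1)/2} \mathcal{L}^\alpha(\mathcal{P}_n^0; C_n)] \to 0 \quad (0 < \alpha < 1). \tag{136}$$

Proof. For ease of notation, write $X_i = \mathcal{L}^\alpha(\mathcal{P}_n; \Gamma_i^x)$ and $Y_i = \mathcal{L}^\alpha(\mathcal{P}_n; \Gamma_i^y)$, for $i = 1, 2, \ldots, k_n$. Also let $Z = \mathcal{L}^\alpha(\mathcal{P}_n; C_n^0)$. Then

$$\mathrm{Var}[\mathcal{L}^\alpha(\mathcal{P}_n; C_n)] = \mathrm{Var}\Big[ Z + \sum_{i=1}^{k_n} X_i + \sum_{i=1}^{k_n} Y_i \Big]. \tag{137}$$

Let $N_i^x$, $N_i^y$, $N_0$, respectively, denote the number of points of $\mathcal{P}_n$ in $\Gamma_i^x$, $\Gamma_i^y$, $C_n^0$, respectively. Then by (131), $N_i^x$ is Poisson with parameter asymptotic to $n^{3\varepsilon+\sigma-1/2}$, while $N_1^x + N_1^y + N_0$ is Poisson with parameter asymptotic to $2n^{3\varepsilon+\sigma-1/2}$; hence as $n \to \infty$ we have

$$E[(N_i^x)^2] \sim n^{6\varepsilon+2\sigma-1}, \qquad E[(N_1^x + N_1^y + N_0)^2] \sim 4n^{6\varepsilon+2\sigma-1}. \tag{138}$$

Edges from points in $\Gamma_1^x \cup \Gamma_1^y \cup C_n^0$ are of length at most $2n^{2\varepsilon+\sigma-1}$, and hence,

$$\mathrm{Var}[X_1 + Y_1 + Z] \le (2n^{2\varepsilon+\sigma-1})^{2\alpha} E[(N_1^x + N_1^y + N_0)^2] \sim 2^{2+2\alpha} n^{6\varepsilon+2\sigma-1+2\alpha(2\varepsilon+\sigma-1)}. \tag{139}$$

For $\alpha \ge 1$, since $\varepsilon$ is small (129), the expression (139) is $O(n^{10\varepsilon+4\sigma-3})$ and in fact tends to zero, so that

$$\mathrm{Var}(X_1 + Y_1 + Z) \to 0 \quad (\alpha \ge 1). \tag{140}$$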



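By Lemma 8.1 and (132), given $E_{x,i}$, an edge from a point of $\Gamma_i^x$ can be of length no more than $3n^{2\varepsilon+\sigma-1}$. Thus using (138) we have

$$\mathrm{Var}[X_i \mathbf{1}\{E_{x,i}\}] \le E[X_i^2 \mathbf{1}\{E_{x,i}\}] \le (3n^{2\varepsilon+\sigma-1})^{2\alpha} E[(N_i^x)^2] = O(n^{6\varepsilon+2\sigma-1+2\alpha(2\varepsilon+\sigma-1)}). \tag{141}$$

Next, observe that $\mathrm{Cov}[X_i \mathbf{1}\{E_{x,i}\}, X_j \mathbf{1}\{E_{x,j}\}] = 0$ for $i - j > 3$, since by Lemma 8.1 $X_i \mathbf{1}\{E_{x,i}\}$ is determined by the restriction of $\mathcal{P}_n$ to the union of the regions $\Gamma_\ell^x \cup \beta_\ell^x$, $i - 3 \le \ell \le i$. Thus by (130), Cauchy-Schwarz and (141), we obtain

$$\mathrm{Var}\Big[\sum_{i=2}^{k_n} X_i \mathbf{1}\{E_{x,i}\}\Big] = \sum_{i=2}^{k_n} \mathrm{Var}[X_i \mathbf{1}\{E_{x,i}\}] + \sum_{i=2}^{k_n} \sum_{j : 1 \le |j-i| \le 3} \mathrm{Cov}[X_i \mathbf{1}\{E_{x,i}\}, X_j \mathbf{1}\{E_{x,j}\}] = O(n^{4\varepsilon+\sigma+2\alpha(2\varepsilon+\sigma-1)}). \tag{142}$$

For $\alpha \ge 1$, the bound in (142) tends to zero as $n \to \infty$, since $1/2 < \sigma < 2/3$ and $\varepsilon$ is small (129).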



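By (131), the cells $\beta_i^x$, $i = 1, \ldots, k_n$, have width asymptotic to $n^{2\varepsilon+\sigma-1}$ and height $n^{-\sigma}$, so the mean number of points of $\mathcal{P}_n$ in one of these cells is asymptotic to $n^{2\varepsilon}$; hence for any cell $\beta_i^x$ or $\beta_i^y$, $i = 1, \ldots, k_n$, the probability that the cell contains no point of $\mathcal{P}_n$ is given by $\exp\{-n^{2\varepsilon}(1 + o(1))\}$. Hence for $n$ large enough, and $i = 2, \ldots, k_n$, we have $P[E_{x,i}^c] \le \exp(-n^{\varepsilon})$, and hence by (138),

$$\mathrm{Var}[X_i \mathbf{1}\{E_{x,i}^c\}] \le E[X_i^2 \mid E_{x,i}^c] \, P[E_{x,i}^c] \le 2^\alpha E[(N_i^x)^2] \exp(-n^{\varepsilon}) = O(n^{6\varepsilon+2\sigma-1} \exp(-n^{\varepsilon})). \tag{143}$$

Hence by Cauchy-Schwarz we have

$$\mathrm{Var}\Big[\sum_{i=2}^{k_n} X_i \mathbf{1}\{E_{x,i}^c\}\Big] = \sum_{i=2}^{k_n} \mathrm{Var}[X_i \mathbf{1}\{E_{x,i}^c\}] + \sum_{i \ne j} \mathrm{Cov}[X_i \mathbf{1}\{E_{x,i}^c\}, X_j \mathbf{1}\{E_{x,j}^c\}] = O\big(k_n^2 n^{6\varepsilon+2\sigma-1} \exp(-n^{\varepsilon})\big) \to 0, \tag{144}$$

as $n \to \infty$. Then by (142), (144), and the analogous estimates for $Y_i$, along with the Cauchy-Schwarz inequality, we obtain for $\alpha \ge 1$ that

$$\mathrm{Var}\Big[\sum_{i=2}^{k_n} X_i \mathbf{1}\{E_{x,i}\} + \sum_{i=2}^{k_n} Y_i \mathbf{1}\{E_{y,i}\} + \sum_{i=2}^{k_n} X_i \mathbf{1}\{E_{x,i}^c\} + \sum_{i=2}^{k_n} Y_i \mathbf{1}\{E_{y,i}^c\}\Big] \to 0, \tag{145}$$

as $n \to \infty$. By (137) with (140), (145), and Cauchy-Schwarz again, we obtain the first part of (134). The argument for $\mathcal{P}_n^0$ is the same as for $\mathcal{P}_n$, so we have (134).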

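Now suppose $0 < \alpha < 1$. We obtain (135) and (136) in a similar way to (134), since (139) implies that

$$\mathrm{Var}\big(n^{(\alpha-1)/2}(X_1 + Y_1 + Z)\big) = O(n^{6\varepsilon+2\sigma-2+\alpha(4\varepsilon+2\sigma-1)})$$

and (142) implies

$$\mathrm{Var}\Big[n^{(\alpha-1)/2} \sum_{i=2}^{k_n} X_i \mathbf{1}\{E_{x,i}\}\Big] = O(n^{4\varepsilon+\sigma-1+\alpha(4\varepsilon+2\sigma-1)}),$$

and both of these bounds tend to zero when $0 < \alpha < 1$, $1/2 < \sigma < 2/3$, and $\varepsilon$ is small (129). □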
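The exponent bookkeeping in this proof is easy to get wrong, so it may help to check it with exact rational arithmetic. The sketch below is an illustration, not part of the proof: the function names are hypothetical, and the sample values of $\sigma$, $\varepsilon$ and $\alpha$ are assumptions chosen to satisfy (129). It evaluates the exponents of $n$ appearing in (139) and (142), and their rescaled versions for $0 < \alpha < 1$, so that one can confirm they are negative.

```python
from fractions import Fraction as F


def eps_bound(sigma):
    """Right-hand side of (129): the largest admissible epsilon is strictly below this."""
    return min(F(1, 2), (1 - sigma) / 3, (3 - 4 * sigma) / 10, (2 - 3 * sigma) / 8)


def exponents(alpha, sigma, eps):
    """Exponents of n in the bounds (139) and (142), in that order."""
    first_cells = 6 * eps + 2 * sigma - 1 + 2 * alpha * (2 * eps + sigma - 1)
    interior = 4 * eps + sigma + 2 * alpha * (2 * eps + sigma - 1)
    return first_cells, interior


def scaled_exponents(alpha, sigma, eps):
    """Same exponents after multiplying the variances by n^(alpha - 1)."""
    first_cells, interior = exponents(alpha, sigma, eps)
    return first_cells + alpha - 1, interior + alpha - 1


if __name__ == "__main__":
    sigma, eps = F(3, 5), F(1, 50)       # sample values with eps < eps_bound(sigma)
    print(eps_bound(sigma))
    print(exponents(1, sigma, eps))      # the case alpha = 1
    print(scaled_exponents(F(1, 2), sigma, eps))
```

With $\alpha = 1$ the two exponents reduce to $10\varepsilon + 4\sigma - 3$ and $8\varepsilon + 3\sigma - 2$, which is where the last two terms of the minimum in (129) come from.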



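To prove those parts of Theorem 2.2 which refer to the binomial process $\mathcal{X}_n$, we need further results comparing the processes $\mathcal{X}_n$ and $\mathcal{P}_n$ when they are coupled as in Lemma 7.4.

Lemma 8.3 Suppose $\alpha \ge 1$. With $\mathcal{X}_n$ and $\mathcal{P}_n$ coupled as in Lemma 7.4, we have that as $n \to \infty$

$$\mathcal{L}^\alpha(\mathcal{X}_n; C_n) - \mathcal{L}^\alpha(\mathcal{P}_n; C_n) \xrightarrow{L^1} 0 \ \text{ and } \ \mathcal{L}^\alpha(\mathcal{X}_n^0; C_n) - \mathcal{L}^\alpha(\mathcal{P}_n^0; C_n) \xrightarrow{L^1} 0. \tag{146}$$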

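Proof. Let $\mathcal{P}_n$ and $\mathcal{X}_m$ ($m \in \mathbb{N}$) be coupled as described in Lemma 7.4. Given $n$, for $m \in \mathbb{N}$ define the event

$$E_{m,n} := \bigcap_{1 \le i \le k_n} \big( \{\mathcal{X}_{m-1} \cap \beta_i^x \ne \emptyset\} \cap \{\mathcal{X}_{m-1} \cap \beta_i^y \ne \emptyset\} \big),$$

with the sub-cells $\beta_i^x$ and $\beta_i^y$ of $B_n$ as defined near the start of Section 8. Then by similar arguments to those for $P[E_{x,i}^c]$ above, we have

$$P[E_{m,n}^c] = O\big(n^{1-\sigma-2\varepsilon} \exp(-n^{\varepsilon}/2)\big), \quad m \ge n/2 + 1.$$

As in the proof of Lemma 7.4, let $Y_m$ denote the in-degree of vertex $\mathbf{X}_m$ in the MDST on $\mathcal{X}_m$. Then

$$|\mathcal{L}^\alpha(\mathcal{X}_m; C_n) - \mathcal{L}^\alpha(\mathcal{X}_{m-1}; C_n)| \le (Y_m + 1) \mathbf{1}\{\mathbf{X}_m \in C_n\} \big( (3n^{2\varepsilon+\sigma-1})^\alpha + 2^{\alpha/2} \mathbf{1}\{E_{m,n}^c\} \big).$$

Thus, given $N(n)$, the difference $|\mathcal{L}^\alpha(\mathcal{X}_n; C_n) - \mathcal{L}^\alpha(\mathcal{P}_n; C_n)|$ is bounded by the sum of these increments over the $|N(n) - n|$ indices $m$ between $\min(N(n), n)$ and $\max(N(n), n)$. Since $C_n$ has area less than $2n^{\varepsilon-1/2}$, by (86) there exists a constant $C$ such that, for $n$ sufficiently large,

$$E[\,|\mathcal{L}^\alpha(\mathcal{X}_n; C_n) - \mathcal{L}^\alpha(\mathcal{P}_n; C_n)| \mid N(n)\,] \le 2^{\alpha/2} n \mathbf{1}\{N(n) < n/2 + 1\} + C |N(n) - n| \log(\max(N(n), n)) \, n^{\alpha(2\varepsilon+\sigma-1)+\varepsilon-1/2} \mathbf{1}\{N(n) \ge n/2 + 1\}. \tag{147}$$

By tail bounds for the Poisson distribution, we have $n P[N(n) < n/2 + 1] \to 0$ as $n \to \infty$, and hence, taking expectations in (147) and using (120), we obtain

$$E[\,|\mathcal{L}^\alpha(\mathcal{X}_n; C_n) - \mathcal{L}^\alpha(\mathcal{P}_n; C_n)|\,] = O(n^{\alpha(2\varepsilon+\sigma-1)+\varepsilon} \log n) + o(1),$$

which tends to zero since $\alpha \ge 1$, $1/2 < \sigma < 2/3$ and $\varepsilon$ is small (see (129)). So we obtain the unrooted part of (146). The argument is the same in the rooted case. □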

Lemma 8.4 Suppose $\mathcal{X}_n$ and $\mathcal{P}_n$ are coupled as described in Lemma 7.4, with $N(n) := \mathrm{card}(\mathcal{P}_n)$. Let $\Delta(\infty)$ be given by Definition 4.1 with $H = \mathcal{L}^1$, and set $\alpha_1 := E[\Delta(\infty)]$. Then as $n \to \infty$ we have

$$\mathcal{L}^1(\mathcal{P}_n; S_{0,n}) - \mathcal{L}^1(\mathcal{X}_n; S_{0,n}) - n^{-1/2} \alpha_1 (N(n) - n) \xrightarrow{L^2} 0; \tag{148}$$

$$\mathcal{L}^1(\mathcal{P}_n^0; S_{0,n}) - \mathcal{L}^1(\mathcal{X}_n^0; S_{0,n}) - n^{-1/2} \alpha_1 (N(n) - n) \xrightarrow{L^2} 0. \tag{149}$$

Proof. The proof of the first part (148) follows that of eqn (4.5) of [17], using our Lemma 4.5 and the fact that the functional is homogeneous of order 1, is strongly stabilizing by Lemma 6.1 and satisfies the moments condition (62) by Lemma 6.3. As shown in the proof of Corollary 6.1, we have that $\mathcal{L}^1(\mathcal{P}_n^0; S_{0,n}) - \mathcal{L}^1(\mathcal{P}_n; S_{0,n})$ converges to zero in $L^2$ and $\mathcal{L}^1(\mathcal{X}_n^0; S_{0,n}) - \mathcal{L}^1(\mathcal{X}_n; S_{0,n})$ converges to zero in $L^2$. Therefore the second part (149) follows from (148). □

We are now in a position to prove Theorem 2.2. We divide the proof into two cases: $\alpha \ne 1$ and $\alpha = 1$. In the latter case, to prove the result for the Poisson process $\mathcal{P}_n$, we need to show that $\mathcal{L}^1(\mathcal{P}_n; B_n)$ and $\mathcal{L}^1(\mathcal{P}_n; S_{0,n})$ are asymptotically independent; likewise for $\mathcal{P}_n^0$. We shall then obtain the results for the binomial process $\mathcal{X}_n$ and for $\mathcal{X}_n^0$ from those for $\mathcal{P}_n$ and $\mathcal{P}_n^0$ via the coupling described in Lemma 7.4.

Proof of Theorem 2.2 for $\alpha \ne 1$. First suppose $0 < \alpha < 1$. For the Poisson case, we have

$$n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n) = n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; S_{0,n}) + n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; B_n) + n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; C_n). \tag{150}$$

The first term on the right-hand side of (150) converges in distribution to $\mathcal{N}(0, s_\alpha^2)$ by Theorem 6.1 (iv), and the other two terms converge in probability to 0 by eqns (105) and (135). Thus Slutsky's theorem yields the first (Poisson) part of (11). To obtain the second (binomial) part of (11), we use the coupling of Lemma 7.4. We write

$$n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n) = n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; S_{0,n}) + n^{(\alpha-1)/2} \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; B_n \cup C_n) + n^{(\alpha-1)/2} \big( \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; B_n \cup C_n) - \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; B_n \cup C_n) \big). \tag{151}$$

The first term on the right-hand side of (151) is asymptotically $\mathcal{N}(0, t_\alpha^2)$ by Theorem 6.1 (ii). The second term tends to zero in probability by (105) and (135). The third term tends to zero in probability by (121). Thus we have the binomial case of (11).

The rooted case (8) is similar. For the first (Poisson) part of (8), we use Corollary 6.1 (iv) with (106) and (136), and Slutsky's theorem. The second part of (8) follows from the analogous statement to (151) with the addition of the origin, using Corollary 6.1 (ii) with (106), (136), (122), and Slutsky's theorem again.

Next, suppose $\alpha > 1$. We have

$$\tilde{\mathcal{L}}^\alpha(\mathcal{P}_n) = \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; S_{0,n}) + \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; C_n) + \tilde{\mathcal{L}}^\alpha(\mathcal{P}_n; B_n). \tag{152}$$

The first term on the right-hand side converges to 0 in probability, by Theorem 6.1 (iii). The second term also converges to 0 in probability, by the first part of (134). Then by (103) and Slutsky's theorem, we obtain the first (Poisson) part of (13). To obtain the rooted version, i.e. the first part of (10), we replace $\mathcal{P}_n$ by $\mathcal{P}_n^0$ in (152), and combine (101) with Corollary 6.1 (iii) and the second part of (134), and apply Slutsky's theorem again.

To obtain the binomial versions of the results (10) and (13), we again make use of the coupling described in Lemma 7.4. We have

$$\tilde{\mathcal{L}}^\alpha(\mathcal{X}_n) = \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; S_{0,n}) + \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; C_n) + \tilde{\mathcal{L}}^\alpha(\mathcal{X}_n; B_n). \tag{153}$$

The first term on the right-hand side converges in probability to zero by Theorem 6.1 (i). The second term converges in probability to zero by the first part of (134) and the first part of (146). The third term converges in distribution to $\tilde{F}_\alpha^{\{1\}} + \tilde{F}_\alpha^{\{2\}}$ by (104). Hence, Slutsky's theorem yields the binomial part of (13).

Similarly, by replacing $\mathcal{P}_n$ by $\mathcal{P}_n^0$ and $\mathcal{X}_n$ by $\mathcal{X}_n^0$ in (153), and using Corollary 6.1 (i), the second part of (134) and of (146), (102) and Slutsky's theorem, we obtain the binomial part of (10). This completes the proof for $\alpha \ne 1$.

Proof of Theorem 2.2 for $\alpha = 1$: the Poisson case. We now prove the first part of (9) and the first part of (12). Given $n$, set $q_n := 4\lfloor n^{\varepsilon+\sigma-1/2} \rfloor$. Split each cell $\Gamma_i^x$ of $C_n^x$ into $4q_n$ rectangular sub-cells, by splitting the horizontal edge into $q_n$ segments and the vertical edge into 4 segments by a rectangular grid. Similarly, split each cell $\Gamma_i^y$ by splitting the vertical edge into $q_n$ segments and the horizontal edge into 4 segments. Finally, add a single square sub-cell in the top right-hand corner of $C_n^0$, of side $(1/4)n^{\varepsilon-1/2}$, and denote this "the corner sub-cell".

The total number of all such sub-cells is $1 + 8k_n q_n \sim 32n^{(1/2)-\varepsilon}$. Each of the sub-cells has width asymptotic to $(1/4)n^{\varepsilon-1/2}$ and height asymptotic to $(1/4)n^{\varepsilon-1/2}$, and so the area of each cell is asymptotic to $(1/16)n^{2\varepsilon-1}$. So for large $n$, for each of these sub-cells, the probability that it contains no point of $\mathcal{P}_n$ is bounded by $\exp(-n^{\varepsilon})$.

Let $E_n$ be the event that each of the sub-cells described above contains at least one point of $\mathcal{P}_n$. Then

$$P[E_n^c] = O\big(n^{(1/2)-\varepsilon} \exp(-n^{\varepsilon})\big) \to 0. \tag{154}$$



Suppose $\mathbf{x}$ lies on the lower boundary of $S_{0,n}$. Consider the rectangular sub-cell of $\Gamma_i^x$ lying just to the left of the sub-cell directly below $\mathbf{x}$ (or the corner sub-cell if that lies just to the left of the sub-cell directly below $\mathbf{x}$). All points $\mathbf{y}$ in this sub-cell satisfy $\mathbf{y} \preccurlyeq^* \mathbf{x}$, and for large $n$, satisfy $\|\mathbf{y} - \mathbf{x}\| < (3/4)n^{\varepsilon-1/2}$, whereas the nearest point to $\mathbf{x}$ in $B_n$ is at a distance at least $(3/4)n^{\varepsilon-1/2}$. Arguing similarly for $\mathbf{x}$ on the left boundary of $S_{0,n}$, and using the triangle inequality, we see that if $E_n$ occurs, no point in $S_{0,n}$ can be connected to any point in $B_n$, provided $n$ is sufficiently large.

For simplicity of notation, set $X_n := \tilde{\mathcal{L}}^1(\mathcal{P}_n; B_n)$ and $Y_n := \tilde{\mathcal{L}}^1(\mathcal{P}_n; S_{0,n})$. Also, set $X := \tilde{D}_1^{\{1\}} + \tilde{D}_1^{\{2\}}$ and $Y \sim \mathcal{N}(0, s_1^2)$, independent of $X$, with $s_1$ as given in Theorem 6.1. We know from Theorem 7.1 and Theorem 6.1 that $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} Y$ as $n \to \infty$.

We need to show that $X_n + Y_n \xrightarrow{d} X + Y$, where $X$ and $Y$ are independent random variables. We show this by convergence of the characteristic function,

$$E[\exp(it(X_n + Y_n))] \to E[\exp(itX)] \, E[\exp(itY)]. \tag{155}$$

With $\omega$ denoting the configuration of points in $C_n$, we have

$$E[\exp(it(X_n + Y_n))] = \int_{E_n} E[e^{itX_n} e^{itY_n} \mid \omega] \, dP(\omega) + E[e^{it(X_n+Y_n)} \mathbf{1}_{E_n^c}]$$
$$= \int_{E_n} E[e^{itX_n}] \, E[e^{itY_n} \mid \omega] \, dP(\omega) + E[e^{it(X_n+Y_n)} \mathbf{1}_{E_n^c}],$$

where we have used the fact that $X_n$ and $Y_n$ are conditionally independent, given $\omega \in E_n$, for $n$ sufficiently large, and that $X_n$ is independent of the configuration in $C_n$. Then $E[e^{it(X_n+Y_n)} \mathbf{1}_{E_n^c}] \to 0$ as $n \to \infty$, since $P[E_n^c] \to 0$. So

$$E[\exp(it(X_n + Y_n))] - E[e^{itX_n}] \, E[e^{itY_n} \mathbf{1}_{E_n}] \to 0,$$

and we obtain (155) since $E[e^{itY_n} \mathbf{1}_{E_n}] = E[e^{itY_n}] - E[e^{itY_n} \mathbf{1}_{E_n^c}]$, $E[e^{itY_n} \mathbf{1}_{E_n^c}] \to 0$, $E[e^{itX_n}] \to E[e^{itX}]$, and $E[e^{itY_n}] \to E[e^{itY}]$ as $n \to \infty$.

We can now prove the first (Poisson) part of (12). We have the $\alpha = 1$ case of (152). The contribution from $C_n$ converges in probability to 0 by the first part of (134). Slutsky's theorem and (155) then give the first (Poisson) part of (12). The rooted Poisson case (9) follows from the rooted version of (152), this time applying the argument for (155) taking $X_n := \tilde{\mathcal{L}}^1(\mathcal{P}_n^0; B_n)$, $Y_n := \tilde{\mathcal{L}}^1(\mathcal{P}_n^0; S_{0,n})$ and $X$, $Y$ as before, and then using the second part of (134) and Slutsky's theorem again. Thus we obtain the first (Poisson) part of (9).

Proof of Theorem 2.2 for $\alpha = 1$: the binomial case. It remains for us to prove the second part of (9) and the second part of (12). To do this, we use the coupling of Lemma 7.4 once more. Considering first the unrooted case, we here set $X_n := \mathcal{L}^1(\mathcal{X}_n; B_n)$ and $Y_n := \mathcal{L}^1(\mathcal{X}_n; S_{0,n})$. Set $X_n' := \mathcal{L}^1(\mathcal{P}_n; B_n)$ and $Y_n' := \mathcal{L}^1(\mathcal{P}_n; S_{0,n})$ (note that all these random variables are uncentred).

Set $Y \sim \mathcal{N}(0, s_1^2)$ with $s_1$ as given in Theorem 6.1. Set $X := \tilde{D}_1^{\{1\}} + \tilde{D}_1^{\{2\}}$, independent of $Y$. Then by (155) we have (in our new notation)

$$X_n' - EX_n' + Y_n' - EY_n' \xrightarrow{d} X + Y. \tag{156}$$

By (123), we have $X_n - X_n' \xrightarrow{L^1} 0$ and $EX_n - EX_n' \to 0$. Also, with $\alpha_1$ as defined in Lemma 8.4, eqn (148) of that result gives us

$$Y_n' - Y_n - n^{-1/2} \alpha_1 (N(n) - n) \xrightarrow{L^2} 0, \tag{157}$$

so that $E[Y_n'] - E[Y_n] \to 0$. Combining these observations with (156), and using Slutsky's theorem, we obtain

$$X_n - EX_n + Y_n - EY_n + n^{-1/2} \alpha_1 (N(n) - n) \xrightarrow{d} X + Y. \tag{158}$$

By Theorem 6.1 (iii) we have $\mathrm{Var}(Y_n') \to s_1^2$ as $n \to \infty$. By (157), and the independence of $N(n)$ and $Y_n$, we have

$$s_1^2 = \lim_{n \to \infty} \mathrm{Var}[Y_n + n^{-1/2} \alpha_1 (N(n) - n)] = \lim_{n \to \infty} (\mathrm{Var}[Y_n] + \alpha_1^2) \tag{159}$$

so that $\alpha_1^2 \le s_1^2$. Also, $n^{-1/2} \alpha_1 (N(n) - n)$ is independent of $X_n + Y_n$, and asymptotically $\mathcal{N}(0, \alpha_1^2)$. Since the $\mathcal{N}(0, s^2)$ characteristic function is $\exp(-s^2 t^2/2)$, for all $t \in \mathbb{R}$ we obtain from (158) that

$$E[\exp(it(X_n - EX_n + Y_n - EY_n))] \to \exp(-(s_1^2 - \alpha_1^2) t^2/2) \, E[\exp(itX)]$$

so that

$$X_n - EX_n + Y_n - EY_n \xrightarrow{d} X + W, \tag{160}$$

where $W \sim \mathcal{N}(0, s_1^2 - \alpha_1^2)$, and $W$ is independent of $X$.

We have the $\alpha = 1$ case of (153). By the first part of (134) and the first part of (146), the contribution from $C_n$ tends to zero in probability. Hence by (160) and Slutsky's theorem, we obtain the second (binomial) part of (12).

For the rooted case, we apply the argument above, now taking $X_n := \mathcal{L}^1(\mathcal{X}_n^0; B_n)$, $Y_n := \mathcal{L}^1(\mathcal{X}_n^0; S_{0,n})$ (with $X_n'$, $Y_n'$ the corresponding quantities for $\mathcal{P}_n^0$), and with $X$, $Y$ and $W$ as before. The rooted case of (156) follows from the rooted case of (155), and now we have $X_n - X_n' \xrightarrow{L^1} 0$ and $EX_n - EX_n' \to 0$ by (124). In the rooted case (157) still holds by (149), and then we obtain the rooted case of (160) as before.

To obtain the second (binomial) part of (9), we start with the rooted version of the $\alpha = 1$ case of (153). By the second part of (134) and of (146), the contribution from $C_n$ tends to zero in probability. Hence by the rooted version of (160) and Slutsky's theorem, we obtain the second part of (9).

This completes the proof of the $\alpha = 1$ case, and hence the proof of Theorem 2.2 is complete. □

Acknowledgements 

The first author began this work while at the University of Durham, and was also 
supported by the Isaac Newton Institute for Mathematical Sciences, Cambridge. 
The second author was supported by the EPSRC. 

References 

[1] Avram, F. and Bertsimas, D. (1993) On central limit theorems in geometrical 
probability, Ann. Appl. Probab., 3, 1033-1046. 

[2] Bai, Z., Lee, S. and Penrose, M. D. (2004) Rooted edges in a minimal directed 
spanning tree, preprint. 

[3] Barndorff-Nielsen, O. and Sobel, M. (1966) On the distribution of the number 
of admissible points in a vector random sample, Theory Probab. Appl. 11, 249- 
269. 

[4] Bertoin, J. and Gnedin, A. (2004) Asymptotic laws for nonconservative self- 
similar fragmentations. Preprint, available from arXiv:math.pr/0402227. 

[5] Berger, N., Bollobás, B., Borgs, C., Chayes, J., and Riordan, O. (2003) Degree distribution of the FKP model. Automata, Languages and Programming: 30th International Colloquium, ICALP 2003, Lecture Notes in Computer Science 2719, eds. J. C. M. Baeten, J. K. Lenstra, J. Parrow, and G. J. Woeginger, Springer, Heidelberg, 725-738.

[6] Bhatt, A. G. and Roy, R. (2004) On a random directed spanning tree. Adv. 
App. Probab., 36, 19-42. 

[7] Hoare, C. A. R. (1961) Algorithm 64: Quicksort, Comms. of the ACM, 4, 321. 






[8] Hwang, H.-K. (1998) Asymptotics of divide-and-conquer recurrences: Batcher's sorting algorithm and a minimum Euclidean matching heuristic, Algorithmica, 22, 529-546.

[9] Kesten, H. and Lee, S. (1996) The central limit theorem for weighted minimal 
spanning trees on random points. Ann. Appl. Probab. 6 495-527. 

[10] Kingman, J. F. C. (1993) Poisson Processes, Oxford Studies in Probability, 3, 
Oxford University Press, Oxford. 

[11] McLeish, D. L. (1974) Dependent central limit theorems and invariance prin- 
ciples, Ann. Probab., 2, 620-628. 

[12] Miles, R. E. (1970) On the homogeneous planar Poisson point process. Math. 
Biosci. 6, 85-127. 

[13] Neininger, R. and Rüschendorf, L. (2004) A general limit theorem for recursive algorithms and combinatorial structures, Ann. Appl. Probab., 14, 378-418.

[14] Penrose, M. (2003) Random Geometric Graphs, Oxford Studies in Probability, 
6, Clarendon Press, Oxford. 

[15] Penrose, M. D. (2004) Multivariate spatial central limit theorems with applications to percolation and spatial graphs, preprint available from http://www.maths.bath.ac.uk/MATHEMATICS/preprints.html

[16] Penrose, M. D. and Wade, A. R. (2004) Random minimal directed spanning 
trees and Dickman-type distributions. Adv. App. Probab, 36, 691-714. 

[17] Penrose, M. D. and Yukich, J. E. (2001) Central limit theorems for some graphs 
in computational geometry, Ann. Appl. Probab., 11, 1005-1041. 

[18] Penrose, M. D. and Yukich, J. E. (2003) Weak laws of large numbers in geo- 
metric probability, Ann. Appl. Probab., 13, 277-303. 

[19] Penrose, M. D. and Yukich, J. E. (2004) Normal approximation in geometric 
probability, preprint. 

[20] Rodríguez-Iturbe, I. and Rinaldo, A. (1997) Fractal River Basins: Chance and Self-Organization, Cambridge University Press, Cambridge.

[21] Rösler, U. (1992) A fixed point theorem for distributions. Stochastic Process. Appl. 42, 195-214.

[22] Rösler, U. and Rüschendorf, L. (2001) The contraction method for recursive algorithms, Algorithmica, 29, 3-33.

[23] Seppäläinen, T. and Yukich, J. E. (2001) Large deviation principles for Euclidean functionals and other nearly additive processes, Probability Theory and Related Fields 120, 309-345.

[24] Steele, J. M. (1997) Probability Theory and Combinatorial Optimization, So- 
ciety for Industrial and Applied Mathematics, Philadelphia. 

[25] Yukich, J. E. (1998) Probability Theory of Classical Euclidean Optimization 
Problems, Lecture Notes in Mathematics, 1675, Springer, Berlin. 



58 



